Re: pg_basebackup failed to read a file

2018-08-15 Thread Michael Paquier
On Tue, Aug 14, 2018 at 12:14:59PM -0400, Tom Lane wrote:
> That seems like a pretty expensive thing to do, if there are lots of
> files ... and you'd still end up failing, so it's not moving the ball
> very far.

Yeah, I would think that with many small relations it is going to have a
measurable performance impact if we scan the whole data directory a
second time.

> More generally, this seems closely related to bug #14999 [1]
> which concerned pg_rewind's behavior in the face of unexpected file
> permissions within the data directory.  We ended up not doing anything
> about that except documenting it, which I wasn't very satisfied with,
> but the costs of doing better seemed to exceed the benefits.

Please feel free to read the end of the thread for details on the
matter.  There are many things you could do; all of them have drawbacks.

> It'd be nice to have a more coherent theory about what needs to be copied
> or not, and not fail on files that could simply be ignored.  Up to now
> we've resisted having any centrally defined knowledge of what can be
> inside a PG data directory, but maybe that bullet needs to be bitten.

Yeah, I have not really come up with a nice idea yet, especially as some
users have been deploying custom files that sometimes move around, so I am
not completely sure that we need to do anything here anyway, nor that it is
worth the trouble.  One saner strategy may be to keep your custom files in a
directory outside the main data folder...
--
Michael




Re: upgrading from pg 9.3 to 10

2018-08-15 Thread Michael Paquier
On Tue, Aug 14, 2018 at 04:15:12PM -0300, Martín Marqués wrote:
> I'd recommend testing with a clone of the server to verify that it works
> properly (not only pg_upgrade, but your application with the new version
> of postgres). Also to time the window you'll need and see if there are
> things to be aware of, like extensions which are upgraded and might
> break the upgrade.
> 
> Now if you are going to first jump to 9.4, I'd recommend using pglogical
> after getting to 9.4 and upgrade straight from 9.4 to 10 (always after
> testing your application against 10)

Well, pglogical has the advantage of reducing the downtime, which may
not matter depending on your application: if you can accept a bit of
downtime, pg_upgrade --link can be pretty quick at its job.  Test it
beforehand, as --link is a no-return trip.
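
For reference, a --link run looks roughly like the sketch below.  This is a
minimal example only: the CentOS/PGDG paths are illustrative, the new cluster
must already be initdb'd, and both servers must be stopped.  Do the --check
dry run first (and on a clone, as Martín suggests):

    /usr/pgsql-10/bin/pg_upgrade \
        --old-bindir=/usr/pgsql-9.3/bin \
        --new-bindir=/usr/pgsql-10/bin \
        --old-datadir=/var/lib/pgsql/9.3/data \
        --new-datadir=/var/lib/pgsql/10/data \
        --link --check   # drop --check for the real, irreversible run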
--
Michael




Re: During promotion, new master tries to archive same segment twice

2018-08-15 Thread Adrian Klaver

On 08/15/2018 01:25 PM, Phil Endecott wrote:

Dear Experts,

Here is my latest issue with replication:

I have 3 systems, X, Y and Z.  Initially X is replicated to Y
and Y is replicated to Z; in each case the replication involves
log-shipping using archive_command and restore_command (via a
4th system called "backup") and then streaming.

The plan is to shut down X and to promote Y to be the new master.

I shut down X, and as expected Y complains that it can no longer
connect to it:

2018-08-15 15:10:58.785 UTC [617] LOG:  replication terminated by primary server
2018-08-15 15:10:58.785 UTC [617] DETAIL:  End of WAL reached on timeline 1 at 
7/E998.
2018-08-15 15:10:58.785 UTC [617] FATAL:  could not send end-of-streaming 
message to primary: no COPY in progress
scp: backup/postgresql/archivedir/0001000700E9: No such file or 
directory
2018-08-15 15:10:59.928 UTC [354] LOG:  invalid resource manager ID 22 at 
7/E998
2018-08-15 15:11:00.092 UTC [5856] FATAL:  could not connect to the primary 
server: FATAL:  the database system is shutting down

Note that in the middle of that, after streaming has failed it
tries to fetch the next WAL segment, E9, from the backup; this
fails because this segment doesn't exist.

Then I pg_ctl promote on Y:

2018-08-15 15:12:11.888 UTC [354] LOG:  received promote request
2018-08-15 15:12:11.888 UTC [354] LOG:  redo done at 7/E928
2018-08-15 15:12:11.888 UTC [354] LOG:  last completed transaction was at log 
time 2018-08-15 14:45:05.961153+00
scp: backup/postgresql/archivedir/0001000700E9: No such file or 
directory
scp: backup/postgresql/archivedir/0002.history: No such file or directory
2018-08-15 15:12:13.316 UTC [354] LOG:  selected new timeline ID: 2
2018-08-15 15:12:13.368 UTC [354] FATAL:  could not open file "recovery.conf": 
Permission denied

So in the process of exiting recovery mode it tries to rename
recovery.conf to recovery.done, but it fails to do so because
my recovery.conf was owned by root, not user postgres.  Oooops.
Perhaps it would be appropriate to check for that when it
initially reads recovery.conf?  Anyway, I fix the permissions
and restart it.  It starts in recovery mode and cannot connect
to the master, as expected.

2018-08-15 15:16:58.061 UTC [6036] LOG:  restored log file 
"0001000700E8" from archive
2018-08-15 15:16:58.108 UTC [6036] LOG:  redo starts at 7/E8678300
2018-08-15 15:16:58.731 UTC [6087] LOG:  archive command failed with exit code 1
2018-08-15 15:16:58.731 UTC [6087] DETAIL:  The failed archive command was: ssh backup 
test ! -f backup/postgresql/Y/0001000700E8 && scp 
pg_xlog/0001000700E8 backup:backup/postgresql/Y/0001000700E8
scp: backup/postgresql/archivedir/0001000700E9: No such file or 
directory


The above is not clear to me. My best guess:

1) The ssh backup ... is archiving WALs from Y to a Y-specific archive on 
'backup'?


2) The scp: backup ... is the restore_command fetching archived WALs 
that originated on X and were sent to backup/postgresql/archivedir/ ?


3) If 2) is true why is it showing up in the error for the archive command?

Can you show us your actual archive and restore commands?

More below.


2018-08-15 15:16:58.823 UTC [6036] LOG:  consistent recovery state reached at 
7/E998
2018-08-15 15:16:58.823 UTC [6036] LOG:  invalid resource manager ID 22 at 
7/E998
2018-08-15 15:16:58.824 UTC [6035] LOG:  database system is ready to accept 
read only connections
2018-08-15 15:16:58.855 UTC [6094] FATAL:  could not connect to the primary 
server: could not connect to server: Connection refused





I can now connect to the db and do read/write operations, but it
continues to try and fail to re-write segment E8 to its archive.

Eventually I manually delete E8 from the archive; it then
succeeds at writing it with the same content as before and seems to
continue to function normally.


Which archive?




Questions:

- Can all of this be blamed on the failure of the first attempt
to promote due to the wrong permissions on recovery.conf?

- Should my archive_command detect the case where it is asked to
write the same file again with the same contents, and report
success in that case?

- Is this a bug?


Thanks,

Phil.







--
Adrian Klaver
adrian.kla...@aklaver.com



Re: During promotion, new master tries to archive same segment twice

2018-08-15 Thread David Steele
Hi Phil,

On 8/15/18 4:25 PM, Phil Endecott wrote:
> 
> Questions:
> 
> - Can all of this be blamed on the failure of the first attempt 
> to promote due to the wrong permissions on recovery.conf?

Yes, it looks that way.

> - Should my archive_command detect the case where it is asked to 
> write the same file again with the same contents, and report 
> success in that case?

Yes.  pgBackRest has done this for years and it saves a *lot* of headaches.

> - Is this a bug?

I don't think so.  There are a number of cases where the same WAL
segment can be pushed more than once, especially after failures where
Postgres is not sure that the command completed.  The archive command
should handle this gracefully.
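
A minimal sketch of such an archive_command wrapper, assuming the same
ssh/scp layout as in Phil's logs (host "backup", per-server directory);
the script name and checksum tool are my own illustration, not something
shipped with Postgres:

    #!/bin/sh
    # Illustrative idempotent archive_command wrapper (not production code):
    #   archive_command = '/usr/local/bin/archive_wal.sh %p %f'
    set -eu
    src="$1"                      # %p: segment path, relative to the data directory
    fname="$2"                    # %f: segment file name
    dir="backup/postgresql/Y"     # archive directory on host "backup"

    if ssh backup test -f "$dir/$fname"; then
        # Segment already archived: report success only if the contents match,
        # which covers the re-push-after-promotion case discussed here.
        remote_sum=$(ssh backup md5sum "$dir/$fname" | awk '{print $1}')
        local_sum=$(md5sum "$src" | awk '{print $1}')
        if [ "$remote_sum" = "$local_sum" ]; then
            exit 0
        fi
        exit 1                    # same name, different contents: a real conflict
    fi
    # A hardened version would copy to a temporary name and rename it into
    # place, so a half-copied file can never satisfy the existence test above.
    exec scp "$src" "backup:$dir/$fname"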

Regards,
-- 
-David
da...@pgmasters.net



Re: Odd Row Estimates in Query Plan (rows=75)

2018-08-15 Thread Don Seiler
On Wed, Aug 15, 2018 at 3:31 PM, Adrian Klaver 
wrote:

>
> lts.date_added  > '2017-07-14 11:13:05'
>
> and
>
> lts.date_gifted >= '2017-08-13 11:13:05'
> ?
>
> In other words one '>' and the other '>=' ?
>

The date_added filters were added just to use that index, with a broad
range, since there isn't an index on date_gifted. You'll notice the
date_added range is 30 days but the date_gifted range is 3 hours. We really
only care about date_gifted but at this time there isn't an index on that
field.

Even as I experiment with some query rewrites, the EXPLAIN ANALYZE always
says rows=75. I'm *very* curious to see why it is using that value.

Don.

-- 
Don Seiler
www.seiler.us


Re: Odd Row Estimates in Query Plan (rows=75)

2018-08-15 Thread Adrian Klaver

On 08/15/2018 01:03 PM, Don Seiler wrote:

Here's the query, obfuscated manually by me:

SELECT
    'Foo' as system_function,
    stores.name as store,
    lt.owner,
    lt.minute_of_day,
    lt.records
FROM
    foo.stores
LEFT OUTER JOIN
    (SELECT
        lts.store_pkey,
        lts.owner,
        date_trunc('minute', lts.date_gifted) as minute_of_day,
        count(*) as records
    FROM foo.gifts lts
    WHERE
        lts.date_added  > '2017-07-14 11:13:05'
    AND lts.date_added  < '2017-08-13 14:14:21'
    AND lts.date_gifted >= '2017-08-13 11:13:05'
    AND lts.date_gifted <  '2017-08-13 14:14:21'
    GROUP BY 1,2,3
    ORDER BY 1
    ) lt ON lt.store_pkey = stores.pkey
WHERE lt.records IS NOT NULL;

The foo.gifts table is pretty much the core table of our database. It's 
big and very active. There is an index on date_added but not yet on 
date_gifted.


I'm working to re-write the query while the dev sees if we even need 
this query anymore.




I agree the issue seems to be in the index/filter of the dates. That 
leads me to another question:


Why in:

WHERE
lts.date_added  > '2017-07-14 11:13:05'
AND
lts.date_added  < '2017-08-13 14:14:21'
AND
lts.date_gifted >= '2017-08-13 11:13:05'
AND
lts.date_gifted <  '2017-08-13 14:14:21'

is

lts.date_added  > '2017-07-14 11:13:05'

and

lts.date_gifted >= '2017-08-13 11:13:05'
?

In other words one '>' and the other '>=' ?








--
Adrian Klaver
adrian.kla...@aklaver.com



During promotion, new master tries to archive same segment twice

2018-08-15 Thread Phil Endecott
Dear Experts,

Here is my latest issue with replication:

I have 3 systems, X, Y and Z.  Initially X is replicated to Y 
and Y is replicated to Z; in each case the replication involves 
log-shipping using archive_command and restore_command (via a 
4th system called "backup") and then streaming.

The plan is to shut down X and to promote Y to be the new master.

I shut down X, and as expected Y complains that it can no longer 
connect to it:

2018-08-15 15:10:58.785 UTC [617] LOG:  replication terminated by primary server
2018-08-15 15:10:58.785 UTC [617] DETAIL:  End of WAL reached on timeline 1 at 
7/E998.
2018-08-15 15:10:58.785 UTC [617] FATAL:  could not send end-of-streaming 
message to primary: no COPY in progress
scp: backup/postgresql/archivedir/0001000700E9: No such file or 
directory
2018-08-15 15:10:59.928 UTC [354] LOG:  invalid resource manager ID 22 at 
7/E998
2018-08-15 15:11:00.092 UTC [5856] FATAL:  could not connect to the primary 
server: FATAL:  the database system is shutting down

Note that in the middle of that, after streaming has failed it 
tries to fetch the next WAL segment, E9, from the backup; this 
fails because this segment doesn't exist.

Then I pg_ctl promote on Y:

2018-08-15 15:12:11.888 UTC [354] LOG:  received promote request
2018-08-15 15:12:11.888 UTC [354] LOG:  redo done at 7/E928
2018-08-15 15:12:11.888 UTC [354] LOG:  last completed transaction was at log 
time 2018-08-15 14:45:05.961153+00
scp: backup/postgresql/archivedir/0001000700E9: No such file or 
directory
scp: backup/postgresql/archivedir/0002.history: No such file or directory
2018-08-15 15:12:13.316 UTC [354] LOG:  selected new timeline ID: 2
2018-08-15 15:12:13.368 UTC [354] FATAL:  could not open file "recovery.conf": 
Permission denied

So in the process of exiting recovery mode it tries to rename 
recovery.conf to recovery.done, but it fails to do so because 
my recovery.conf was owned by root, not user postgres.  Oooops. 
Perhaps it would be appropriate to check for that when it 
initially reads recovery.conf?  Anyway, I fix the permissions 
and restart it.  It starts in recovery mode and cannot connect 
to the master, as expected.

2018-08-15 15:16:58.061 UTC [6036] LOG:  restored log file 
"0001000700E8" from archive
2018-08-15 15:16:58.108 UTC [6036] LOG:  redo starts at 7/E8678300
2018-08-15 15:16:58.731 UTC [6087] LOG:  archive command failed with exit code 1
2018-08-15 15:16:58.731 UTC [6087] DETAIL:  The failed archive command was: ssh 
backup test ! -f backup/postgresql/Y/0001000700E8 && scp 
pg_xlog/0001000700E8 
backup:backup/postgresql/Y/0001000700E8
scp: backup/postgresql/archivedir/0001000700E9: No such file or 
directory
2018-08-15 15:16:58.823 UTC [6036] LOG:  consistent recovery state reached at 
7/E998
2018-08-15 15:16:58.823 UTC [6036] LOG:  invalid resource manager ID 22 at 
7/E998
2018-08-15 15:16:58.824 UTC [6035] LOG:  database system is ready to accept 
read only connections
2018-08-15 15:16:58.855 UTC [6094] FATAL:  could not connect to the primary 
server: could not connect to server: Connection refused

But note that in the middle of that log fragment it has tried to 
write WAL segment E8 to its (output) WAL archive, which is used to 
replicate to the downstream system Z, and it has found that that 
segment already exists there.  The file that's already there has 
the same checksum as the E8 file in the pg_xlog directory.  
Somehow it has forgotten that it has already archived that segment. 
It keeps retrying this archive command.

I try again to promote it:

2018-08-15 15:17:14.424 UTC [6036] LOG:  received promote request
2018-08-15 15:17:14.424 UTC [6036] LOG:  redo done at 7/E928
scp: backup/postgresql/archivedir/0001000700E9: No such file or 
directory
scp: backup/postgresql/archivedir/0002.history: No such file or directory
2018-08-15 15:17:15.832 UTC [6036] LOG:  selected new timeline ID: 2
2018-08-15 15:17:15.884 UTC [6036] LOG:  archive recovery complete
scp: backup/postgresql/archivedir/0001.history: No such file or directory
2018-08-15 15:17:16.612 UTC [6036] LOG:  MultiXact member wraparound 
protections are now enabled
2018-08-15 15:17:16.614 UTC [6035] LOG:  database system is ready to accept 
connections

I can now connect to the db and do read/write operations, but it 
continues to try and fail to re-write segment E8 to its archive.

Eventually I manually delete E8 from the archive; it then 
succeeds at writing it with the same content as before and seems to 
continue to function normally.

Questions:

- Can all of this be blamed on the failure of the first attempt 
to promote due to the wrong permissions on recovery.conf?

- Should my archive_command detect the case where it is asked to 
write the same file again with the same contents, and report 
success in that case?

- Is this a bug?


Thanks,

Phil.





Re: Odd Row Estimates in Query Plan (rows=75)

2018-08-15 Thread Don Seiler
Here's the query, obfuscated manually by me:

SELECT
'Foo' as system_function,
stores.name as store,
lt.owner,
lt.minute_of_day,
lt.records
FROM
foo.stores
LEFT OUTER JOIN
(SELECT
lts.store_pkey,
lts.owner,
date_trunc('minute', lts.date_gifted) as minute_of_day,
count(*) as records
FROM foo.gifts lts
WHERE
lts.date_added  > '2017-07-14 11:13:05'
AND lts.date_added  < '2017-08-13 14:14:21'
AND lts.date_gifted >= '2017-08-13 11:13:05'
AND lts.date_gifted <  '2017-08-13 14:14:21'
GROUP BY 1,2,3
ORDER BY 1
) lt ON lt.store_pkey = stores.pkey
WHERE lt.records IS NOT NULL;

The foo.gifts table is pretty much the core table of our database. It's big
and very active. There is an index on date_added but not yet on
date_gifted.

I'm working to re-write the query while the dev sees if we even need this
query anymore.

On Wed, Aug 15, 2018 at 2:39 PM, Adrian Klaver 
wrote:

> On 08/15/2018 12:31 PM, Don Seiler wrote:
>
>> PostgreSQL 9.6.6 on CentOS.
>>
>> We have a report query that has gone from maybe a few seconds to run to a
>> few minutes to run since mid-July. Looking at the output of EXPLAIN
>> ANALYZE, the row count estimates are way off, even though this table was
>> just analyzed a day or so ago. What's more bizarre to me is that the row
>> count esimate is *always* 75 for every node of the plan, where the actual
>> rows is in the hundreds or thousands. This table is one of the busiest
>> tables in our production database (many inserts and updates). It is
>> autovacuumed and autoanalyzed a few times per week, although I'm looking to
>> change it to a nightly manual schedule to avoid daytime autovacuums.
>>
>> Hash Join  (cost=1869142.34..1869146.15 rows=75 width=88) (actual
>> time=179877.869..179878.011 rows=759 loops=1)
>> Hash Cond: (stores.pkey = lt.store_pkey)
>> Buffers: shared hit=1654593 read=331897 dirtied=249
>> ->  Seq Scan on stores  (cost=0.00..2.77 rows=77 width=22) (actual
>> time=0.007..0.023 rows=78 loops=1)
>>   Buffers: shared hit=2
>> ->  Hash  (cost=1869141.40..1869141.40 rows=75 width=50) (actual
>> time=179877.847..179877.847 rows=759 loops=1)
>>   Buckets: 1024  Batches: 1  Memory Usage: 73kB
>>   Buffers: shared hit=1654591 read=331897 dirtied=249
>>   ->  Subquery Scan on lt  (cost=1869138.59..1869141.40 rows=75
>> width=50) (actual time=179875.976..179877.697 rows=759 loops=1)
>> Buffers: shared hit=1654591 read=331897 dirtied=249
>> ->  GroupAggregate  (cost=1869138.59..1869140.65 rows=75
>> width=50) (actual time=179875.976..179877.606 rows=759 loops=1)
>>   Group Key: lts.store_pkey, lts.owner,
>> (date_trunc('minute'::text, lts.date_gifted))
>>   Filter: (count(*) IS NOT NULL)
>>   Buffers: shared hit=1654591 read=331897 dirtied=249
>>   ->  Sort  (cost=1869138.59..1869138.78 rows=75
>> width=42) (actual time=179875.961..179876.470 rows=6731 loops=1)
>> Sort Key: lts.store_pkey,
>> lts.entry_source_owner, (date_trunc('minute'::text, lts.date_gifted))
>> Sort Method: quicksort  Memory: 757kB
>> Buffers: shared hit=1654591 read=331897
>> dirtied=249
>> ->  Index Scan using gifts_date_added on
>> gifts lts  (cost=0.56..1869136.25 rows=75 width=42) (actual
>> time=190.657..179870.165 rows=6731 loops=1)
>>   Index Cond: ((date_added > '2018-07-14
>> 11:13:05'::timestamp without time zone) AND (date_added < '2018-08-13
>> 14:14:21'::timestamp without time zone))
>>   Filter: ((date_gifted >= '2018-08-13
>> 11:13:05'::timestamp without time zone) AND (date_gifted < '2018-08-13
>> 14:14:21'::timestamp without time zone))
>>   Rows Removed by Filter: 938197
>>   Buffers: shared hit=1654591 read=331897
>> dirtied=249
>>   Planning time: 0.426 ms
>>   Execution time: 179893.894 ms
>>
>> I don't have a version of this query from prior to this summer, but
>> getting explain plan for older data from older sandboxes show a similar
>> plan.
>>
>
> I don't have an answer, just a question:
>
> Can you provide the actual query and the table schema?
>
>
>> Sidenote: I am suggesting that an index be added on the date_gifted field
>> as that is far more selective and avoids throwing rows away. However I'm
>> very interested in why every node dealing with the gifts table thinks
>> rows=75 when the actual is much, much higher. And 75 seems like too round
>> of a number to be random?
>>
>> --
>> Don Seiler
>> www.seiler.us 
>>
>
>
> 

Re: Odd Row Estimates in Query Plan (rows=75)

2018-08-15 Thread Adrian Klaver

On 08/15/2018 12:31 PM, Don Seiler wrote:

PostgreSQL 9.6.6 on CentOS.

We have a report query that has gone from maybe a few seconds to run to 
a few minutes to run since mid-July. Looking at the output of EXPLAIN 
ANALYZE, the row count estimates are way off, even though this table was 
just analyzed a day or so ago. What's more bizarre to me is that the row 
count estimate is *always* 75 for every node of the plan, where the 
actual rows are in the hundreds or thousands. This table is one of the 
busiest tables in our production database (many inserts and updates). It 
is autovacuumed and autoanalyzed a few times per week, although I'm 
looking to change it to a nightly manual schedule to avoid daytime 
autovacuums.


Hash Join  (cost=1869142.34..1869146.15 rows=75 width=88) (actual 
time=179877.869..179878.011 rows=759 loops=1)

    Hash Cond: (stores.pkey = lt.store_pkey)
    Buffers: shared hit=1654593 read=331897 dirtied=249
    ->  Seq Scan on stores  (cost=0.00..2.77 rows=77 width=22) (actual 
time=0.007..0.023 rows=78 loops=1)

          Buffers: shared hit=2
    ->  Hash  (cost=1869141.40..1869141.40 rows=75 width=50) (actual 
time=179877.847..179877.847 rows=759 loops=1)

          Buckets: 1024  Batches: 1  Memory Usage: 73kB
          Buffers: shared hit=1654591 read=331897 dirtied=249
          ->  Subquery Scan on lt  (cost=1869138.59..1869141.40 rows=75 
width=50) (actual time=179875.976..179877.697 rows=759 loops=1)

                Buffers: shared hit=1654591 read=331897 dirtied=249
                ->  GroupAggregate  (cost=1869138.59..1869140.65 rows=75 
width=50) (actual time=179875.976..179877.606 rows=759 loops=1)
                      Group Key: lts.store_pkey, lts.owner, 
(date_trunc('minute'::text, lts.date_gifted))

                      Filter: (count(*) IS NOT NULL)
                      Buffers: shared hit=1654591 read=331897 dirtied=249
                      ->  Sort  (cost=1869138.59..1869138.78 rows=75 
width=42) (actual time=179875.961..179876.470 rows=6731 loops=1)
                            Sort Key: lts.store_pkey, 
lts.entry_source_owner, (date_trunc('minute'::text, lts.date_gifted))

                            Sort Method: quicksort  Memory: 757kB
                            Buffers: shared hit=1654591 read=331897 
dirtied=249
                            ->  Index Scan using gifts_date_added on 
gifts lts  (cost=0.56..1869136.25 rows=75 width=42) (actual 
time=190.657..179870.165 rows=6731 loops=1)
                                  Index Cond: ((date_added > '2018-07-14 
11:13:05'::timestamp without time zone) AND (date_added < '2018-08-13 
14:14:21'::timestamp without time zone))
                                  Filter: ((date_gifted >= '2018-08-13 
11:13:05'::timestamp without time zone) AND (date_gifted < '2018-08-13 
14:14:21'::timestamp without time zone))

                                  Rows Removed by Filter: 938197
                                  Buffers: shared hit=1654591 
read=331897 dirtied=249

  Planning time: 0.426 ms
  Execution time: 179893.894 ms

I don't have a version of this query from prior to this summer, but 
getting explain plan for older data from older sandboxes show a similar 
plan.


I don't have an answer, just a question:

Can you provide the actual query and the table schema?



Sidenote: I am suggesting that an index be added on the date_gifted 
field as that is far more selective and avoids throwing rows away. 
However I'm very interested in why every node dealing with the gifts 
table thinks rows=75 when the actual is much, much higher. And 75 seems 
like too round of a number to be random?


--
Don Seiler
www.seiler.us 



--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Code of Conduct plan

2018-08-15 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Unless there are substantial objections, or nontrivial changes as a result
>> of this round of comments, we anticipate making the CoC official as of
>> July 1 2018.

> We seem to be a bit past that timeline...  Do we have any update on when
> this will be moving forward?

> Or did I miss something?

Nope, you didn't.  Folks have been on holiday which made it hard to keep
forward progress going, particularly with respect to selecting the initial
committee members.  Now that Magnus is back on shore, I hope we can
wrap it up quickly --- say by the end of August.

regards, tom lane



Odd Row Estimates in Query Plan (rows=75)

2018-08-15 Thread Don Seiler
PostgreSQL 9.6.6 on CentOS.

We have a report query that has gone from maybe a few seconds to run to a
few minutes to run since mid-July. Looking at the output of EXPLAIN
ANALYZE, the row count estimates are way off, even though this table was
just analyzed a day or so ago. What's more bizarre to me is that the row
count estimate is *always* 75 for every node of the plan, where the actual
rows are in the hundreds or thousands. This table is one of the busiest
tables in our production database (many inserts and updates). It is
autovacuumed and autoanalyzed a few times per week, although I'm looking to
change it to a nightly manual schedule to avoid daytime autovacuums.

 Hash Join  (cost=1869142.34..1869146.15 rows=75 width=88) (actual
time=179877.869..179878.011 rows=759 loops=1)
   Hash Cond: (stores.pkey = lt.store_pkey)
   Buffers: shared hit=1654593 read=331897 dirtied=249
   ->  Seq Scan on stores  (cost=0.00..2.77 rows=77 width=22) (actual
time=0.007..0.023 rows=78 loops=1)
 Buffers: shared hit=2
   ->  Hash  (cost=1869141.40..1869141.40 rows=75 width=50) (actual
time=179877.847..179877.847 rows=759 loops=1)
 Buckets: 1024  Batches: 1  Memory Usage: 73kB
 Buffers: shared hit=1654591 read=331897 dirtied=249
 ->  Subquery Scan on lt  (cost=1869138.59..1869141.40 rows=75
width=50) (actual time=179875.976..179877.697 rows=759 loops=1)
   Buffers: shared hit=1654591 read=331897 dirtied=249
   ->  GroupAggregate  (cost=1869138.59..1869140.65 rows=75
width=50) (actual time=179875.976..179877.606 rows=759 loops=1)
 Group Key: lts.store_pkey, lts.owner,
(date_trunc('minute'::text, lts.date_gifted))
 Filter: (count(*) IS NOT NULL)
 Buffers: shared hit=1654591 read=331897 dirtied=249
 ->  Sort  (cost=1869138.59..1869138.78 rows=75
width=42) (actual time=179875.961..179876.470 rows=6731 loops=1)
   Sort Key: lts.store_pkey,
lts.entry_source_owner, (date_trunc('minute'::text, lts.date_gifted))
   Sort Method: quicksort  Memory: 757kB
   Buffers: shared hit=1654591 read=331897
dirtied=249
   ->  Index Scan using gifts_date_added on gifts
lts  (cost=0.56..1869136.25 rows=75 width=42) (actual
time=190.657..179870.165 rows=6731 loops=1)
 Index Cond: ((date_added > '2018-07-14
11:13:05'::timestamp without time zone) AND (date_added < '2018-08-13
14:14:21'::timestamp without time zone))
 Filter: ((date_gifted >= '2018-08-13
11:13:05'::timestamp without time zone) AND (date_gifted < '2018-08-13
14:14:21'::timestamp without time zone))
 Rows Removed by Filter: 938197
 Buffers: shared hit=1654591 read=331897
dirtied=249
 Planning time: 0.426 ms
 Execution time: 179893.894 ms

I don't have a version of this query from prior to this summer, but getting
explain plan for older data from older sandboxes show a similar plan.

Sidenote: I am suggesting that an index be added on the date_gifted field
as that is far more selective and avoids throwing rows away. However I'm
very interested in why every node dealing with the gifts table thinks
rows=75 when the actual is much, much higher. And 75 seems like too round
of a number to be random?
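
For what it's worth, the index I have in mind would be something like the
sketch below (the index name is made up, nothing like it exists yet);
CONCURRENTLY avoids blocking writes on this busy table, though it cannot run
inside a transaction block:

    CREATE INDEX CONCURRENTLY gifts_date_gifted
        ON foo.gifts (date_gifted);

With that in place the broad date_added range, which is only there to hit
the existing index, could probably be dropped from the query.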

-- 
Don Seiler
www.seiler.us


Re: Code of Conduct plan

2018-08-15 Thread Bruce Momjian
On Wed, Aug 15, 2018 at 03:22:10PM -0400, Stephen Frost wrote:
> Greetings,
> 
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
> > Unless there are substantial objections, or nontrivial changes as a result
> > of this round of comments, we anticipate making the CoC official as of
> > July 1 2018.
> 
> We seem to be a bit past that timeline...  Do we have any update on when
> this will be moving forward?
> 
> Or did I miss something?

Are we waiting for the conference community guidelines to be solidified?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Code of Conduct plan

2018-08-15 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Unless there are substantial objections, or nontrivial changes as a result
> of this round of comments, we anticipate making the CoC official as of
> July 1 2018.

We seem to be a bit past that timeline...  Do we have any update on when
this will be moving forward?

Or did I miss something?

Thanks!

Stephen




Re: moving data to from ms sql server via SSIS

2018-08-15 Thread Joshua D. Drake

On 08/15/2018 11:03 AM, cosmin ioan wrote:

hi folks,
we're trying to phase out MS SQL Server and we're trying to understand 
the easiest way to move data from MS SQL Server to postgres, via SSIS, 
say via ODBC, OleDB or .Net drivers, and basically create new tables & 
move data, into the target postgres db.  What drivers would be the 
best/most flexible?


thanks much for any info or white papers on the topic,


A *lot* of this is: it depends

However, a pretty simple way is to use either FDWs from PostgreSQL or 
linked tables in MSSQL. It would be easy enough, for example, to have 
table ms.foo which has data and then table postgres.foo that is empty. 
If ms.foo is linked (or an FDW table) you can use standard SQL to do 
things like INSERT INTO ... SELECT * FROM ...
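
The FDW route could look roughly like this sketch, assuming the tds_fdw
extension (an MS SQL Server/Sybase wrapper) and reusing the ms.foo /
postgres.foo naming above; connection details, columns, and option names are
illustrative and should be checked against your installed tds_fdw version:

    CREATE EXTENSION tds_fdw;

    CREATE SERVER mssql_svr
        FOREIGN DATA WRAPPER tds_fdw
        OPTIONS (servername 'mssql.example.com', port '1433', database 'SourceDB');

    CREATE USER MAPPING FOR CURRENT_USER
        SERVER mssql_svr
        OPTIONS (username 'migrator', password 'secret');

    -- mirror of the source table; columns are invented for the example
    CREATE FOREIGN TABLE ms.foo (
        id   integer,
        name text
    ) SERVER mssql_svr OPTIONS (table_name 'dbo.foo');

    -- then the copy is plain SQL, as described above
    INSERT INTO postgres.foo SELECT * FROM ms.foo;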


Of course, if you can take the outage and prep PostgreSQL appropriately,
you could also just dump out to CSV or tab-delimited files and load them
with COPY.
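
For that route, once a table has been exported to CSV (e.g. with bcp on the
SQL Server side), the load from psql is a one-liner; the file path here is
just an example:

    \copy postgres.foo FROM '/tmp/foo.csv' WITH (FORMAT csv, HEADER true)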


JD


--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
***  A fault and talent of mine is to tell it exactly how it is.  ***
PostgreSQL centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://postgresconf.org
* Unless otherwise stated, opinions are my own.   *




moving data to from ms sql server via SSIS

2018-08-15 Thread cosmin ioan
hi folks,
we're trying to phase out MS SQL Server and we're trying to understand the
easiest way to move data from MS SQL Server to postgres, via SSIS, say via
ODBC, OleDB or .Net drivers, and basically create new tables & move data,
into the target postgres db.  What drivers would be the best/most flexible?

thanks much for any info or white papers on the topic,
Cos


Re: using graph model with PostgreSQL

2018-08-15 Thread Joshua D. Drake

On 08/15/2018 07:09 AM, 김세훈 wrote:

Hi there,

currently I'm using PostgreSQL with PostGIS extension to handle 
geospatial data.


In my project I need to apply some graph algorithms like MST for some 
network of GPS coordinates.


I know there is some way of using Neo4j with PostgreSQL, but is there 
any other way to construct a graph model within the PostgreSQL environment?


Check out AgensGraph from Bitnine. It's open source.



Any external modules would be welcomed.


Thanks.




--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
***  A fault and talent of mine is to tell it exactly how it is.  ***
PostgreSQL centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://postgresconf.org
* Unless otherwise stated, opinions are my own.   *




Re: How to revoke privileged from PostgreSQL's superuser

2018-08-15 Thread Bruce Momjian
On Tue, Aug 14, 2018 at 03:59:19PM -0400, Bruce Momjian wrote:
> On Fri, Aug 10, 2018 at 04:06:40PM -0400, Benedict Holland wrote:
> > I also would take Bruce's comment with a massive grain of salt. Everything 
> > that
> > everyone does on a database is logged somewhere assuming proper logging. 
> > Now do
> > you have the person-power to go through gigs of plain text logs to find out 
> > if
> > someone is doing something shady... that is a question for your management
> > team. Also, if you suspect someone of doing something shady, you should
> > probably revoke their admin rights. 
> 
> Agreed, the best way to limit the risk of undetected DBA removal of data
> is secure auditing --- I should have mentioned that.

So, how do you securely audit?  You ship the logs to a server that isn't
controlled by the DBA, via syslog?  How do you prevent the DBA from
turning off logging when they want to do something undetected?  Do you
log the turning off of logging?
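
The Postgres side of that syslog wiring would presumably be no more than the
standard GUCs, sketched below; the forwarding to an off-host collector is an
rsyslog/syslog-ng job outside Postgres, and the settings shown are only one
possible choice:

    ALTER SYSTEM SET log_destination = 'syslog';
    ALTER SYSTEM SET syslog_facility = 'local0';
    ALTER SYSTEM SET log_statement = 'all';       -- aggressive; tune to taste
    ALTER SYSTEM SET log_connections = on;
    ALTER SYSTEM SET log_disconnections = on;
    SELECT pg_reload_conf();

    -- The caveat behind the questions above: a superuser can ALTER SYSTEM
    -- these right back off.  With log_statement = 'all' that statement is
    -- itself logged before the change takes effect, so the remote collector
    -- at least records the act of disabling logging.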

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



using graph model with PostgreSQL

2018-08-15 Thread 김세훈
Hi there,

currently I'm using PostgreSQL with PostGIS extension to handle geospatial
data.

In my project I need to apply some graph algorithms like MST for some
network of GPS coordinates.

I know there is some way of using Neo4j with PostgreSQL, but is there any
other way to construct a graph model within the PostgreSQL environment?

Any external modules would be welcomed.


Thanks.