Re: Unused files in the database directory after crashed VACUUM FULL

2019-02-10 Thread Thomas Munro
On Mon, Feb 11, 2019 at 10:21 AM Tom Lane  wrote:
> Ah, so Andrew was correct: we panicked due to lack of WAL space, and
> that explains why the vacuuming process didn't have an opportunity
> to delete the files belonging to the uncommitted new relation.

> It's a pretty well-understood dynamic, I believe.  Perhaps we should
> try harder to recover cleanly, but I don't know of anyone putting
> effort into the case.

FTR I am working on a PG13 patch that records the relfilenodes of
uncommitted transactions in undo logs, so that they can be unlinked
reliably even if you crash (at the cost of introducing a WAL flush
before creating files).  I haven't specifically studied the VACUUM FULL
case yet, but in principle this is exactly what my project aims to fix.
It's mostly intended as example code to demonstrate the undo log
machinery (defining undo record types, registering undo log action
functions that are invoked during rollback, rolling back
aborted-but-not-yet-rolled-back transactions at startup, ...) without
having to understand the whole zheap sandwich at once, but it's also a
solution to an age-old problem.  More on that soon.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Unused files in the database directory after crashed VACUUM FULL

2019-02-10 Thread Tom Lane
Hannes Erven  writes:
> On 2019-02-10 16:41, Tom Lane wrote:
>> What do you mean by "crash" exactly?

> Here's the exact log (the pgadmin process was running the VACUUM FULL):
> 2019-02-09 23:44:53.516 CET [20341] @/ <> PANIC:  could not write to file "pg_wal/xlogtemp.20341": No space left on device

Ah, so Andrew was correct: we panicked due to lack of WAL space, and
that explains why the vacuuming process didn't have an opportunity
to delete the files belonging to the uncommitted new relation.

If you're concerned about this sort of thing, you might consider putting
the WAL directory on a separate filesystem from the main database files.

>>> Is there a safe procedure to
>>> check/clean up "unnecessary" files in the cluster directory?

>> You just described it --- verify with pg_filenode_relation that the
>> file doesn't correspond to any pg_class entry, then manually delete.

> OK, I see. I came up with this:

I think this isn't accounting for files in non-default
tablespaces, but maybe that's not a problem for your usage.
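
If you did want to cover them, something along these lines might work
(an untested sketch: each tablespace directory holds one catalog-version
subdirectory, then one directory per database OID, and
pg_filenode_relation() takes the tablespace OID as its first argument):

with ts as (
    select t.oid as spcoid,
           'pg_tblspc/' || t.oid || '/' || v.verdir as tspath
    from pg_tablespace t,
         pg_ls_dir('pg_tblspc/' || t.oid) as v(verdir)
    where t.spcname not in ('pg_default', 'pg_global')
)
select ts.spcoid, f.fn
from ts,
     pg_database d,
     -- pg_ls_dir errors out if this database has no directory in the tablespace
     pg_ls_dir(ts.tspath || '/' || d.oid) as f(fn)
where d.datname = current_database()
  and f.fn ~ '^[0-9]'
  and pg_filenode_relation(ts.spcoid, split_part(f.fn, '.', 1)::oid) is null;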

> Is this interesting enough to further investigate?

It's a pretty well-understood dynamic, I believe.  Perhaps we should
try harder to recover cleanly, but I don't know of anyone putting
effort into the case.

regards, tom lane



Re: Shared hosting with FDW on AWS RDS

2019-02-10 Thread Paul Jungwirth

On 2/10/19 2:57 PM, auxsvr wrote:

> We'd like to configure an RDS server for shared hosting. The idea is that every
> customer will be using a different database and FDW will be configured, so that
> the remote tables have access to the full data


I've set up something like this before (but on EC2), and the only
problem I couldn't solve was that any user can see your full customer
list by typing `\l` or `\du`. They can't see other customers' stuff, but
they can see how many customers you have and their database/login names.
The only way around it I know of is to run separate "clusters", aka RDS
instances.
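
Those backslash commands just read catalogs that any login role can
query directly, roughly:

SELECT datname FROM pg_database;   -- what \l lists
SELECT rolname FROM pg_roles;      -- what \du lists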


You can try to lock this down somewhat by revoking access to various 
system tables, but it starts breaking a lot of tools (e.g. some GUI 
tools don't know what to do if they get an error just listing the 
databases). Also it is so piecemeal I wouldn't trust that I'd blocked 
off all avenues of getting the information.
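
For illustration, the piecemeal approach looks something like this
(requires a real superuser, which RDS does not give you):

REVOKE SELECT ON pg_catalog.pg_database FROM PUBLIC;  -- hides \l from ordinary users
REVOKE SELECT ON pg_catalog.pg_roles FROM PUBLIC;     -- hides \du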


I'd love to be corrected on this btw if anyone has better information! :-)

--
Paul  ~{:-)
p...@illuminatedcomputing.com



Shared hosting with FDW on AWS RDS

2019-02-10 Thread auxsvr
Hi,

We'd like to configure an RDS server for shared hosting. The idea is that
every customer will be using a different database, with FDW configured so
that the remote tables have access to the full data while materialized
views pull from them the data specific to each customer. So far everything
seems to work fine and be secure, as we've revoked access to the remote
tables for the customer users, but I'm feeling a bit uneasy considering
that the credentials for full access are stored in each database. My
understanding is that remote user mappings are designed so that this will
not be an issue, but I was wondering whether access to the metadata schema
might allow someone to circumvent this restriction. Also, I was wondering
if anyone has experience hardening databases on RDS, as the so-called
superuser does not have the right to revoke access to the metadata schema.
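
For reference, the setup is roughly the following (all names are
placeholders):

CREATE EXTENSION IF NOT EXISTS postgres_fdw;
CREATE SCHEMA remote;

CREATE SERVER shared_src FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'central.example.com', dbname 'shared');

-- the full-access credentials stored in each customer database;
-- the mapping belongs to the role that owns and refreshes the matviews
CREATE USER MAPPING FOR CURRENT_USER SERVER shared_src
    OPTIONS (user 'fdw_full', password 'secret');

IMPORT FOREIGN SCHEMA public FROM SERVER shared_src INTO remote;

-- each customer reads only their own slice through a materialized view
CREATE MATERIALIZED VIEW customer_a_data AS
    SELECT * FROM remote.big_table WHERE customer_id = 1;

REVOKE ALL ON ALL TABLES IN SCHEMA remote FROM customer_a;
GRANT SELECT ON customer_a_data TO customer_a;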

Comments and suggestions are welcome.
-- 
Regards,
Peter





Re: Unused files in the database directory after crashed VACUUM FULL

2019-02-10 Thread Hannes Erven

Hi again,


On 2019-02-10 16:41, Tom Lane wrote:

> Hannes Erven  writes:
>> I've just had a "VACUUM FULL <table>" crash due to 100% disk usage.
>> Clearly my fault, I was expecting the new table to be small enough.
>
> What do you mean by "crash" exactly?  A normal transactional failure
> should've cleaned up orphaned files.  I suppose if the kernel decided to
> kill -9 the vacuum process, that wouldn't happen --- but that's not
> the usual response to out-of-disk-space.



Here's the exact log (the pgadmin process was running the VACUUM FULL):

2019-02-09 23:44:53.516 CET [20341] @/ <> PANIC:  could not write to file "pg_wal/xlogtemp.20341": No space left on device
2019-02-09 23:44:53.516 CET [19181] @/ <> LOG:  WAL writer process (PID 20341) was terminated by signal 6: Aborted
2019-02-09 23:44:53.516 CET [19181] @/ <> LOG:  terminating any other active server processes
2019-02-09 23:44:53.516 CET [20816] postgres@IP/db WARNING:  terminating connection because of crash of another server process
2019-02-09 23:44:53.516 CET [20816] postgres@IP/db DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-02-09 23:44:53.516 CET [20816] postgres@IP/db HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2019-02-09 23:44:53.597 CET [19181] @/ <> LOG:  all server processes terminated; reinitializing
2019-02-09 23:44:53.679 CET [11686] @/ <> LOG:  database system was interrupted; last known up at 2019-02-09 23:44:40 CET
2019-02-09 23:44:58.761 CET [11686] @/ <> LOG:  database system was not properly shut down; automatic recovery in progress
2019-02-09 23:44:58.807 CET [11686] @/ <> LOG:  redo starts at 1DA/B3400798
2019-02-09 23:44:58.811 CET [11686] @/ <> FATAL:  could not extend file "base/16400/612716_fsm": No space left on device
2019-02-09 23:44:58.811 CET [11686] @/ <> HINT:  Check free disk space.
2019-02-09 23:44:58.811 CET [11686] @/ <> CONTEXT:  WAL redo at 1DA/B3406310 for Heap/INSERT: off 4



After freeing space and restarting:

2019-02-10 06:28:25.622 CET [14344] @/ <> LOG:  database system was interrupted while in recovery at 2019-02-09 23:44:58 CET
2019-02-10 06:28:25.622 CET [14344] @/ <> HINT:  This probably means that some data is corrupted and you will have to use the last backup for recovery.
2019-02-10 06:28:26.762 CET [14344] @/ <> LOG:  database system was not properly shut down; automatic recovery in progress
2019-02-10 06:28:26.831 CET [14344] @/ <> LOG:  redo starts at 1DA/B3400798
2019-02-10 06:28:27.624 CET [14344] @/ <> LOG:  redo done at 1DA/D5FFF9D8



>> Is there a safe procedure to
>> check/clean up "unnecessary" files in the cluster directory?
>
> You just described it --- verify with pg_filenode_relation that the
> file doesn't correspond to any pg_class entry, then manually delete.



OK, I see. I came up with this:

with d0 as (
    select oid from pg_database where datname = current_database()
), d1 as (
    select pg_ls_dir as fn,
           regexp_match(pg_ls_dir, '^([0-9]+)(.*)$') as match
    from d0, pg_ls_dir('base/' || d0.oid)
    order by 1
), d2 as (
    select fn, match[1] as base, match[2] as ext
    from d1
    where fn not ilike 'pg_%'
), d3 as (
    select d.*, pg_filenode_relation(0, base::oid) as relation
    from d2 d
)
select fn, pg_size_pretty((pg_stat_file('base/' || d0.oid || '/' || fn)).size)
from d0, d3
where relation is null;



     fn     | pg_size_pretty
------------+----------------
 612713     | 25 MB
 612718     | 1024 MB
 612718.1   | 1024 MB
 612718.2   | 180 MB
 t11_621351 | 96 kB
 t11_621354 | 0 bytes
 t11_621356 | 8192 bytes
 t11_621357 | 160 kB
 t11_621360 | 0 bytes
 t11_621362 | 8192 bytes


So there's still 612713 and 612718* left to clean up; judging from the 
numbers and timestamps, they probably belong to other VACUUM FULLs I 
manually canceled in the pgadmin UI during my experiments.


Is this interesting enough to further investigate? If yes, I'll be happy 
to work on a test case.
In any case, I've now learned (how) to check for orphaned files when a 
database cluster's on-disk size is way off from the sum of 
pg_total_relation_size() + WAL...
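
Something like this seems to do it (a sketch; it sums only top-level
relations so indexes and TOAST data aren't double-counted):

select pg_size_pretty(pg_database_size(current_database())) as on_disk,
       pg_size_pretty(sum(pg_total_relation_size(oid))::bigint) as accounted_for
from pg_class
where relkind in ('r', 'm', 'S');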



Thanks again & best regards,

-hannes





Promote replica before being able to accept connections

2019-02-10 Thread Martín Fernández
Hello Everyone!

We have a 9.2 pg cluster and we are in the process of rebuilding a master
database in our staging environment. To achieve that, we are restoring our
staging database using pg_basebackup against one of our production replicas.

pg_basebackup has completed and the staging database is configured in
hot_standby mode, recovering from the WAL archive. The database has not
reached the point where it can accept connections yet; it is still
ingesting WAL files.

What we are not sure about is: can we promote the database to a master
instance before it can accept connections? We don't care if the staging
database is a few days behind the production environment; that is
acceptable in our use case, and we just want to promote it as soon as
we can.

Thanks in advance!

Best,
Martín


Re: Unused files in the database directory after crashed VACUUM FULL

2019-02-10 Thread Tom Lane
Hannes Erven  writes:
> I've just had a "VACUUM FULL <table>" crash due to 100% disk usage.
> Clearly my fault, I was expecting the new table to be small enough.

What do you mean by "crash" exactly?  A normal transactional failure
should've cleaned up orphaned files.  I suppose if the kernel decided to
kill -9 the vacuum process, that wouldn't happen --- but that's not
the usual response to out-of-disk-space.

> After freeing up space, restarting the cluster and issuing another 
> VACUUM FULL, I noticed that the cluster was way bigger than it should be.
> In the base/<oid>/ folder, there was a large number of files with one 
> certain number that pg_filenode_relation() could not turn into a 
> relation. As that number was just a bit smaller than the 
> pg_relation_filepath() of the table I was working on, I guess these were 
> the remains of the failed VACUUM FULL operation?
> I removed those files and a VACUUM ANALYZE of the whole database went fine.

> So... is this the expected behaviour? Is there a safe procedure to 
> check/clean up "unnecessary" files in the cluster directory?

You just described it --- verify with pg_filenode_relation that the
file doesn't correspond to any pg_class entry, then manually delete.

(There is a crash-recovery phase that does something similar, but
I think it's only designed to clean up temporary files.)
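
For example, for a single suspect file (612713 is just an illustrative
filenode number):

select pg_filenode_relation(0, 612713);
-- null means no relation in the default tablespace (0) owns that filenode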

regards, tom lane



Unused files in the database directory after crashed VACUUM FULL

2019-02-10 Thread Hannes Erven

Hi,


I've just had a "VACUUM FULL <table>" crash due to 100% disk usage.
Clearly my fault, I was expecting the new table to be small enough.

After freeing up space, restarting the cluster and issuing another 
VACUUM FULL, I noticed that the cluster was way bigger than it should be.
In the base/<oid>/ folder, there was a large number of files with one 
certain number that pg_filenode_relation() could not turn into a 
relation. As that number was just a bit smaller than the 
pg_relation_filepath() of the table I was working on, I guess these were 
the remains of the failed VACUUM FULL operation?

I removed those files and a VACUUM ANALYZE of the whole database went fine.

So... is this the expected behaviour? Is there a safe procedure to 
check/clean up "unnecessary" files in the cluster directory?



Version:
('PostgreSQL 10.3 (Debian 10.3-1.pgdg90+1) on x86_64-pc-linux-gnu, 
compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit')


Table definition:
(
id serial PRIMARY KEY,
data bytea,
parent_id integer NOT NULL
)
The "data" column values are large enough to be TOASTed.


Thanks & best regards,

-hannes