Re: simple reporting tools for postgres in aws

2020-12-08 Thread Alexandre Arruda
On Mon, Dec 7, 2020 at 12:53, Chris Stephens wrote:

> Hello,
>
> We have a need to display some very simple reports generated from views in
> an rds postgres database. We need little more functionality beyond simple
> filtering, column hiding, sorting, grouping. Anything much beyond that
> would be a strike against the tool.
>
> i'm looking for something similar to oracle application
> express's interactive reports (i come from an oracle background).
>
> i took a look at what's available from amazon's aws marketplace but
> quickly got overwhelmed. every application i saw was overkill.
>
> users will not be tech savvy. we have a tentative fallback plan to use
> excel with pages linked to views in the database but i'm looking for
> something a little more targeted.
>
> does anyone have any suggestions that fit the description above?
>
> thanks for any input,
> chris
>

Hi,

If you need dashboards with a lot of widgets (charts, tables, etc.), take a
look at this awesome tool called Redash:

https://redash.io

Or if you need reports like Crystal Reports, you can use PRD - Pentaho
Report Designer:

https://community.hitachivantara.com/s/article/pentaho-reporting

And publish them on a Pentaho BI Server.

Best regards,

Alexandre


Re: recovery_target_time and WAL fetch with streaming replication

2019-03-19 Thread Alexandre Arruda
Sorry to revive this post, but I have the same problem:

I set up a streaming replication standby with these steps:
1) create a physical replication slot on the master
2) pg_basebackup -S slot (...)
3) create a recovery.conf with primary_conninfo, primary_slot_name and
recovery_min_apply_delay = '4d' (see the sketch after this list)
4) start the replica; the wal_receiver starts and writes the WAL files
5) after 4 days, I have a delayed replica and ALL the WAL files needed to
replay from 4 days ago until now
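
For reference, a minimal recovery.conf along the lines of step 3 might look
like this (host, user and slot name are placeholders, not the real values):

    standby_mode = 'on'
    primary_conninfo = 'host=master.example.com user=replicator'
    primary_slot_name = 'replica_slot'
    recovery_min_apply_delay = '4d'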

The problem is: if I restart the master or the replica, the
wal_receiver will not start and the replica will not receive
new WAL files, because all the files already in pg_wal satisfy the
recovery_min_apply_delay criterion.
The master will retain the WAL files (because of the inactive slot and its
last LSN), so the space needed to keep these WALs keeps growing.

But the real problem is if the master dies: I am left with a replica that
only has the WAL files received up to the restart.

To work around this, I need to receive the WAL with pg_receivewal and use a
traditional cp as the restore_command in recovery.conf to apply it (see the
sketch below). But that is extra work to monitor and clean up these WALs.
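
A sketch of that workaround, with made-up host, slot and directory names
(the real ones would differ):

    # on the replica, continuously receive WAL from the master into a local directory
    pg_receivewal -h master.example.com -U replicator -S wal_archive_slot \
        -D /var/lib/pgsql/wal_archive

    # recovery.conf on the replica
    standby_mode = 'on'
    restore_command = 'cp /var/lib/pgsql/wal_archive/%f "%p"'
    recovery_min_apply_delay = '4d'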

Is there a way to start the walreceiver independently of the
recovery_min_apply_delay setting?

Best regards,

Alexandre







On Sun, May 13, 2018 at 08:45, Michael Paquier wrote:
>
> On Sun, May 13, 2018 at 09:42:42AM +0200, Hannes Erven wrote:
> > But when new WAL is needed, the standby will fetch /all/ WAL present on the
> > master.
>
> Fetching as much WAL as possible when recovery happens is wanted by
> design, so as it recovers as much as possible.  And that's documented.
>
> > I'd say, the standby should either:
> > - always connect to the primary and fetch any WAL present
>
> This is what a hot standby does.  It keeps waiting for WAL to become
> available whichever the source used (archive, local pg_xlog or stream)
> and switches between one or the other.  You can look at
> WaitForWALToBecomeAvailable to get an idea of the details.
>
> > - stop fetching/streaming WAL when it is not needed for the current
> > recovery_target
>
> The startup process is in charge of recovery (WAL replay and definition
> of from where to get the WAL available), and is the one which decides if
> using streaming is necessary or not.  If streaming is used, then it
> starts a WAL receiver.  If a switch from streaming to another WAL source
> (local pg_xlog or archives) is done, then it shuts down the WAL
> receiver, which consists of sending SIGTERM to the WAL receiver and stopping
> it immediately with a FATAL message (stops the process immediately).  The
> key point is that the WAL receiver is designed to be a light-weight
> transport layer for WAL data.  In short, to be simple, it receives a set
> of WAL bytes and writes them.  It does not include any logic to decode
> WAL records, so it cannot know when a stop point happens or not.  It
> also has no idea of the configuration within recovery.conf, which is
> loaded by the startup process.
>
> Yes, but that's far less simple than just setting recovery_target_time.
>
> It seems to me that archiving provides the control you are looking for.
> --
> Michael



Re: ERROR: found multixact from before relminmxid

2018-11-07 Thread Alexandre Arruda
The best solution I have found is to kick out all connections and run a
SELECT ... FOR UPDATE, with the output discarded to /dev/null, on the
affected tables, i.e.:

psql -o /dev/null -c "select * from table for update" database

After this, vacuum runs without problems again.
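
For several affected tables, a small shell loop can do the same thing; a
sketch, where the database and table names are just examples taken from
these threads:

    # replace the list with the tables that actually fail to vacuum
    for t in co27t fn06t fn06t4; do
        psql -o /dev/null -c "select * from $t for update" production
    done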

Best regards,

Alexandre

On Wed, Nov 7, 2018 at 10:05, Adrien NAYRAT wrote:

> On 11/5/18 5:41 PM, Adrien NAYRAT wrote:
> > We "solved" with a dump/restore on another server, also we kept previous
> > cluster to investigate. I want to be sure we encountered the bug solved
> > in 9.6.9 and it is not a new one.
>
> FYI, I am not sure I will be able to keep the previous cluster for many days.
>
> >
> > If we confirm it, maybe we should advise users to perform an integrity
> > check? I was surprised that 9.6.9 prevents new corruption from appearing,
> > but there is nothing to check whether a database is already corrupted.
>
> If I understand correctly, it seems related to vacuum skipping pages that
> are already frozen. Maybe we should advise doing a VACUUM FREEZE with the
> DISABLE_PAGE_SKIPPING option?
>
> Regards,
>
>


Re: ERROR: found multixact from before relminmxid

2018-06-05 Thread Alexandre Arruda
On Mon, May 28, 2018 at 16:44, Andres Freund wrote:
>
> Hi,
>
> I think I found the bug, and am about to post a fix for it below
> https://postgr.es/m/20180525203736.crkbg36muzxrj...@alap3.anarazel.de.
>
> Greetings,
>
> Andres Freund

Hi Andres,

At the end of April we did a complete dump/reload of the database to version 10.3.
Today, the problem has returned:

production=# vacuum verbose co27t;
INFO:  vacuuming "public.co27t"
ERROR:  found multixact 81704071 from before relminmxid 107665371
production=# vacuum full verbose co27t;
INFO:  vacuuming "public.co27t"
ERROR:  found multixact 105476076 from before relminmxid 107665371
production=# cluster co27t;
ERROR:  found multixact 105476076 from before relminmxid 107665371

But this time, regular vacuum and vacuum full/cluster report different
multixact numbers.
Is your patch applicable to this issue, and is it included in 10.4?

Best regards,

Alexandre



Re: ERROR: found multixact from before relminmxid

2018-04-11 Thread Alexandre Arruda
2018-04-10 19:53 GMT-03:00 Andres Freund :
> On 2018-04-10 08:31:20 -0300, Alexandre Arruda wrote:
>> 2018-04-09 23:51 GMT-03:00 Peter Geoghegan :
>> > On Mon, Apr 9, 2018 at 6:56 PM, Alexandre Arruda  
>> > wrote:
>> >> (... and all other indexes returns null too)
>> >>
>> >> I tried with bt_index_check too. Same results.
>> >
>> > That's interesting, because it tells me that you have a table that
>> > appears to not be corrupt, despite the CLUSTER error. Also, the error
>> > itself comes from sanity checking added to MultiXact freezing fairly
>> > recently, in commit 699bf7d0.
>> >
>> > You didn't say anything about regular VACUUM being broken. Do you find
>> > that it works without any apparent issue?
>> >
>> > I have a suspicion that this could be a subtle bug in
>> > CLUSTER/freezing. The only heap_freeze_tuple() caller is code used by
>> > CLUSTER, so it's not that hard to imagine a MultiXact freezing bug
>> > that is peculiar to CLUSTER. Though I haven't thought about it in much
>> > detail.
>> >
>> > --
>> > Peter Geoghegan
>>
>> Hi Peter,
>>
>> Actually, I first noticed the problem in the logs, from autovacuum:
>>
>> 2018-04-10 08:22:15.385 -03 [55477] CONTEXT:  automatic vacuum of
>> table "production.public.fn06t"
>> 2018-04-10 08:22:16.815 -03 [55477] ERROR:  found multixact 68834765
>> from before relminmxid 73262006
>>
>> production=# vacuum analyze verbose fn06t;
>> INFO:  vacuuming "public.fn06t"
>> ERROR:  found multixact 76440919 from before relminmxid 122128619
>
> What does the function in
> https://www.postgresql.org/message-id/20180319181723.ugaf7hfkluqyo...@alap3.anarazel.de
> say about your table?
>
> Could you post pg_controldata output and
> SELECT * FROM pg_class WHERE oid = 'public.fn06t'::regclass;
> ?
>
> Greetings,
>
> Andres Freund

Hi Andres,

I installed pageinspect and created the function, but it returns nothing:


select * from check_rel('fn06t') limit 100;
 blockno | lp | xmin
---------+----+------
(0 rows)

Other affected table:

vacuum verbose fn06t4;
INFO:  vacuuming "public.fn06t4"
ERROR:  found multixact 68834765 from before relminmxid 73262006

production=# select * from check_rel('fn06t4') limit 100;
 blockno | lp | xmin
---------+----+------
(0 rows)

Best regards,

Alexandre



Re: ERROR: found multixact from before relminmxid

2018-04-10 Thread Alexandre Arruda
2018-04-10 19:09 GMT-03:00 Peter Geoghegan :
> On Tue, Apr 10, 2018 at 4:31 AM, Alexandre Arruda  wrote:
>> Actually, I first noticed the problem in the logs, from autovacuum:
>>
>> 2018-04-10 08:22:15.385 -03 [55477] CONTEXT:  automatic vacuum of
>> table "production.public.fn06t"
>> 2018-04-10 08:22:16.815 -03 [55477] ERROR:  found multixact 68834765
>> from before relminmxid 73262006
>>
>> production=# vacuum analyze verbose fn06t;
>> INFO:  vacuuming "public.fn06t"
>> ERROR:  found multixact 76440919 from before relminmxid 122128619
>
> Do you think that CLUSTER was run before regular VACUUM/autovacuum
> showed this error, though?

Yes, because the table was clustered in the old database and
reclustered when it was reloaded into version 10.
But the table has not been clustered again since.

> Have you noticed any data loss? Things look okay when you do your dump
> + restore, right? The problem, as far as you know, is strictly that
> CLUSTER + VACUUM refuse to finish/raise these multixactid errors?

Apparently I did not see any data loss. A dump/reload into a fresh db
shows the same data as production. I verified this with md5sum on files
generated by a select statement (ordered by pkey) in both
databases (prod and test).
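
A sketch of that verification, with the ORDER BY column, database names and
file names as placeholders for the ones actually used:

    psql -c "select * from fn06t order by <pkey>" prod > prod_fn06t.txt
    psql -c "select * from fn06t order by <pkey>" test > test_fn06t.txt
    md5sum prod_fn06t.txt test_fn06t.txt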
The problem is that regular vacuum does not complete. CLUSTER and
VACUUM FULL were attempts to correct the problem.

But I'm afraid that some data could become corrupted over time.



Re: ERROR: found multixact from before relminmxid

2018-04-10 Thread Alexandre Arruda
2018-04-10 19:53 GMT-03:00 Andres Freund :
> On 2018-04-10 08:31:20 -0300, Alexandre Arruda wrote:
>> 2018-04-09 23:51 GMT-03:00 Peter Geoghegan :
>> > On Mon, Apr 9, 2018 at 6:56 PM, Alexandre Arruda  
>> > wrote:
>> >> (... and all other indexes returns null too)
>> >>
>> >> I tried with bt_index_check too. Same results.
>> >
>> > That's interesting, because it tells me that you have a table that
>> > appears to not be corrupt, despite the CLUSTER error. Also, the error
>> > itself comes from sanity checking added to MultiXact freezing fairly
>> > recently, in commit 699bf7d0.
>> >
>> > You didn't say anything about regular VACUUM being broken. Do you find
>> > that it works without any apparent issue?
>> >
>> > I have a suspicion that this could be a subtle bug in
>> > CLUSTER/freezing. The only heap_freeze_tuple() caller is code used by
>> > CLUSTER, so it's not that hard to imagine a MultiXact freezing bug
>> > that is peculiar to CLUSTER. Though I haven't thought about it in much
>> > detail.
>> >
>> > --
>> > Peter Geoghegan
>>
>> Hi Peter,
>>
>> Actually, I first noticed the problem in the logs, from autovacuum:
>>
>> 2018-04-10 08:22:15.385 -03 [55477] CONTEXT:  automatic vacuum of
>> table "production.public.fn06t"
>> 2018-04-10 08:22:16.815 -03 [55477] ERROR:  found multixact 68834765
>> from before relminmxid 73262006
>>
>> production=# vacuum analyze verbose fn06t;
>> INFO:  vacuuming "public.fn06t"
>> ERROR:  found multixact 76440919 from before relminmxid 122128619
>
> What does the function in
> https://www.postgresql.org/message-id/20180319181723.ugaf7hfkluqyo...@alap3.anarazel.de
> say about your table?
>
> Could you post pg_controldata output and
> SELECT * FROM pg_class WHERE oid = 'public.fn06t'::regclass;
> ?
>
> Greetings,
>
> Andres Freund

pg_control version number:1002
Catalog version number:   201707211
Database system identifier:   6513284223567708119
Database cluster state:   in production
pg_control last modified: Tue 10 Apr 2018 23:51:56 BRT
Latest checkpoint location:   183/E4849618
Prior checkpoint location:183/E19F8A80
Latest checkpoint's REDO location:183/E202C430
Latest checkpoint's REDO WAL file:0001018300E2
Latest checkpoint's TimeLineID:   1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:  0:494263452
Latest checkpoint's NextOID:  1816390854
Latest checkpoint's NextMultiXactId:  256298856
Latest checkpoint's NextMultiOffset:  529468697
Latest checkpoint's oldestXID:10338
Latest checkpoint's oldestXID's DB:   866864162
Latest checkpoint's oldestActiveXID:  494263450
Latest checkpoint's oldestMultiXid:   73262006
Latest checkpoint's oldestMulti's DB: 866864162
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:Tue 10 Apr 2018 23:50:14 BRT
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline:   0
Backup start location:0/0
Backup end location:  0/0
End-of-backup record required:no
wal_level setting:replica
wal_log_hints setting:off
max_connections setting:  600
max_worker_processes setting: 8
max_prepared_xacts setting:   0
max_locks_per_xact setting:   64
track_commit_timestamp setting:   off
Maximum data alignment:   8
Database block size:  8192
Blocks per segment of large relation: 131072
WAL block size:   8192
Bytes per WAL segment:16777216
Maximum length of identifiers:64
Maximum columns in an index:  32
Maximum size of a TOAST chunk:1996
Size of a large-object chunk: 2048
Date/time type storage:   64-bit integers
Float4 argument passing:  by value
Float8 argument passing:  by value
Data page checksum version:   0
Mock authentication nonce:
a8eb07ea85c4bbeaf202020d036277b276bda47ef55c1456723ec2b3c40386b1


SELECT * FROM pg_class WHERE oid = 'public.fn06t'::regclass;
-[ RECORD 1 ]---+--
relname | fn06t
relnamespace| 2200
reltype | 866874136
reloftype   | 0
relowner| 10
relam   | 0
relfilenode | 1092835324

Re: ERROR: found multixact from before relminmxid

2018-04-10 Thread Alexandre Arruda
2018-04-09 23:51 GMT-03:00 Peter Geoghegan :
> On Mon, Apr 9, 2018 at 6:56 PM, Alexandre Arruda  wrote:
>> (... and all other indexes returns null too)
>>
>> I tried with bt_index_check too. Same results.
>
> That's interesting, because it tells me that you have a table that
> appears to not be corrupt, despite the CLUSTER error. Also, the error
> itself comes from sanity checking added to MultiXact freezing fairly
> recently, in commit 699bf7d0.
>
> You didn't say anything about regular VACUUM being broken. Do you find
> that it works without any apparent issue?
>
> I have a suspicion that this could be a subtle bug in
> CLUSTER/freezing. The only heap_freeze_tuple() caller is code used by
> CLUSTER, so it's not that hard to imagine a MultiXact freezing bug
> that is peculiar to CLUSTER. Though I haven't thought about it in much
> detail.
>
> --
> Peter Geoghegan

Hi Peter,

Actually, I first noticed the problem in the logs, from autovacuum:

2018-04-10 08:22:15.385 -03 [55477] CONTEXT:  automatic vacuum of
table "production.public.fn06t"
2018-04-10 08:22:16.815 -03 [55477] ERROR:  found multixact 68834765
from before relminmxid 73262006

production=# vacuum analyze verbose fn06t;
INFO:  vacuuming "public.fn06t"
ERROR:  found multixact 76440919 from before relminmxid 122128619



Re: ERROR: found multixact from before relminmxid

2018-04-09 Thread Alexandre Arruda
2018-04-09 22:16 GMT-03:00 Peter Geoghegan :
> On Mon, Apr 9, 2018 at 5:55 PM, Alexandre Arruda  wrote:
>> I ran amcheck in all index of a table and I only get empty returns.
>
> Did you try doing so with the "heapallindexed" option? That's what's
> really interesting here.
>
> --
> Peter Geoghegan

production=# select bt_index_parent_check('fn06t_pkey',true);
 bt_index_parent_check
------------------------

(1 row)

production=# select bt_index_parent_check('ifn06t1',true);
 bt_index_parent_check
------------------------

(1 row)

production=# select bt_index_parent_check('ifn06t4',true);
 bt_index_parent_check
------------------------

(1 row)

(... and all the other indexes return empty results too)

I tried with bt_index_check too. Same results.
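
For what it's worth, a sketch of a single query to run the same check across
every btree index of the table, assuming the GitHub amcheck build that
accepts the second boolean argument (as used above):

    -- run bt_index_parent_check on every btree index of fn06t
    SELECT c.relname,
           bt_index_parent_check(c.oid, true)
    FROM pg_index i
    JOIN pg_class c ON c.oid = i.indexrelid
    WHERE i.indrelid = 'fn06t'::regclass
      AND c.relam = (SELECT oid FROM pg_am WHERE amname = 'btree');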

regards,

Alexandre



Re: ERROR: found multixact from before relminmxid

2018-04-09 Thread Alexandre Arruda
2018-04-09 18:23 GMT-03:00 Peter Geoghegan :
>
> On Mon, Apr 9, 2018 at 7:01 AM, Tomas Vondra
>  wrote:
> > The bigger question is whether this can actually detect the issue. If
> > it's due to an storage issue, then perhaps yes. But if you only see
> > multixact issues consistently and nothing else, it might easily be a
> > PostgreSQL bug (in which case the checksum will be correct).
>
> You can also run amcheck. Get the version targeting earlier Postgres
> releases off Github (there are packages for most Linux systems). This
> can verify that the heap is consistent with indexes.
>
> --
> Peter Geoghegan


Hi Peter,

I ran amcheck on all the indexes of a table and only got empty results.

I did some tests:

production=# reindex table fn06t;
REINDEX
production=# vacuum verbose fn06t;
INFO:  vacuuming "public.fn06t"
ERROR:  found multixact 76440919 from before relminmxid 122128619

production=# select count(*),sum(fn06vrtitu) from fn06t;
  count  |       sum
---------+-----------------
 2592363 | 4833603148.1172
(1 row)

#createdb freshdb
#pg_dump -t fn06t production | psql freshdb

freshdb=# select count(*),sum(fn06vrtitu) from fn06t;
  count  |       sum
---------+-----------------
 2592363 | 4833603148.1172
(1 row)

psql -c "select * from fn06t order by " production >
multixact_error_fn06t.txt
psql -c "select * from fn06t order by " freshdb > freshdb_fn06t.txt

diff shows no differences between the files, and md5sum produces the same hash.

The question is: if a dump can read the table "correctly", why do
CLUSTER/VACUUM FULL stop while reading it?

Best regards,

Alexandre



Re: ERROR: found multixact from before relminmxid

2018-04-09 Thread Alexandre Arruda
2018-04-06 13:11 GMT-03:00 Tomas Vondra :

>
>
> On 04/06/2018 04:29 PM, Alexandre Arruda wrote:
> > 2018-04-06 9:39 GMT-03:00 Tomas Vondra  > <mailto:tomas.von...@2ndquadrant.com>>:
> >
> >
> >
> > On 04/06/2018 02:09 AM, Alexandre Arruda wrote:
> > > Hi,
> > >
> > > Some time ago, I had this errors frequently showed in logs after
> some
> > > autovacuum in some tables(pg 9.6). VACUUM FULL or CLUSTER in this
> tables
> > > show the same and not complete the tasks (showed by some table
> bloat
> > > select).
> > > Then, I did a full dump/restore into a new version (10.2) and
> everything
> > > is ok for a couple of months. Now, I have errors like this again:
> > >
> > > db1=# cluster pc44t;
> > >
> > > ERROR:  found multixact 134100944 from before relminmxid 192042633
> > >
> > >
> > >
> > > Like before, the only way to make the errors to desapear is by
> > > dump/reload the whole table.
> > >
> > >
> > > Thanks for any help.
> > >
> >
> > That's going to be hard, unless you still have the cluster around.
> >
> > This surely seems like some sort of data corruption issue, but
> without
> > being able to inspect the data pages it's nearly impossible to
> determine
> > what went wrong.
> >
> > We'd also need more information about what happened to the hardware
> and
> > cluster before the issues started to appear - crashes, hardware
> issues.
> > And so on.
> >
> > regards
> >
> > --
> > Tomas Vondra  http://www.2ndQuadrant.com
> > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> >
> >
> > Hi Tomas,
> > The old cluster are gone, unfortunatly.
> >
> > This server is a 128GB ECC RAM with a dual redundant hba fiber channel
> > connect to a sotorage with Raid 6 and I don't have (apparently) any
> > errors reported.
> > Yesterday I did a test with one table:  some sum aggragates, count(*),
> > etc, then dump/reload and repeat the tests the results (of querys) are
> > the same, regarding the vacuum problem
> > thats disapeared.
> >
>
> I'm not sure I understand correctly. So you can reproduce the issue? If
> yes, how can you share the scripts/data you use (and how large is it)?
> If we could reproduce it locally, it would make the investigation much
> easier.
>
> BTW you mentioned you're using PostgreSQL 9.6 - which minor version,
> specifically?
>
>
> regards
>
>
Hi Tomas,

No, I can't reproduce it. What I did was just a simple way to "validate" the
current table data, to see whether a dump/reload preserves it. The old
PostgreSQL was 9.6.5. The problem is now returning on a new 10.3 installation.

Is there a way to correct these tables without a dump/reload?

I'm thinking of reinstalling the cluster with initdb --data-checksums (see the
sketch below), but I'm afraid of a severe performance impact.
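
A sketch of what that would look like, with the data directory path and ports
as placeholders (the data would still have to be dumped and restored into the
new cluster):

    # initialize a brand-new cluster with data checksums enabled
    initdb -D /var/lib/pgsql/10/data_checksums --data-checksums

    # then reload the databases into it, e.g. from the old cluster on port 5432
    # into the new one on port 5433
    pg_dumpall -p 5432 | psql -p 5433 -d postgres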

Best regards,

Alexandre


Re: ERROR: found multixact from before relminmxid

2018-04-06 Thread Alexandre Arruda
2018-04-06 9:39 GMT-03:00 Tomas Vondra :

>
>
> On 04/06/2018 02:09 AM, Alexandre Arruda wrote:
> > Hi,
> >
> > Some time ago, I had this errors frequently showed in logs after some
> > autovacuum in some tables(pg 9.6). VACUUM FULL or CLUSTER in this tables
> > show the same and not complete the tasks (showed by some table bloat
> > select).
> > Then, I did a full dump/restore into a new version (10.2) and everything
> > is ok for a couple of months. Now, I have errors like this again:
> >
> > db1=# cluster pc44t;
> >
> > ERROR:  found multixact 134100944 from before relminmxid 192042633
> >
> >
> >
> > Like before, the only way to make the errors to desapear is by
> > dump/reload the whole table.
> >
> >
> > Thanks for any help.
> >
>
> That's going to be hard, unless you still have the cluster around.
>
> This surely seems like some sort of data corruption issue, but without
> being able to inspect the data pages it's nearly impossible to determine
> what went wrong.
>
> We'd also need more information about what happened to the hardware and
> cluster before the issues started to appear - crashes, hardware issues.
> And so on.
>
> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

Hi Tomas,
The old cluster is gone, unfortunately.

This server has 128GB of ECC RAM and dual redundant HBA fibre channel
connections to a RAID 6 storage array, and (apparently) there are no errors
reported.
Yesterday I did a test with one table: some sum aggregates, count(*), etc.,
then a dump/reload, and repeated the tests; the query results are the same,
and on the reloaded copy the vacuum problem disappeared.

best regards


ERROR: found multixact from before relminmxid

2018-04-05 Thread Alexandre Arruda
Hi,

Some time ago, I frequently had these errors show up in the logs after
autovacuum on some tables (pg 9.6). VACUUM FULL or CLUSTER on these tables
showed the same error and did not complete (as seen in a table bloat query).
Then I did a full dump/restore into a new version (10.2) and everything was
OK for a couple of months. Now I have errors like this again:

db1=# cluster pc44t;

ERROR:  found multixact 134100944 from before relminmxid 192042633


Like before, the only way to make the errors disappear is to dump/reload
the whole table.


Thanks for any help.


Best regards,


Alexandre