Re: [GENERAL] Help on Index only scan

2017-08-13 Thread Tom Lane
Ertan Küçükoğlu  writes:
>>> I want to have an index only scan for my below query:
>>> select autoinc, fileversion from updates where filename = 'Robox.exe' order
>>> by autoinc desc;

>> On 14 Aug 2017, at 01:15, Melvin Davidson  wrote:
>> As far as "Index only scan" , since the table only has 2003 rows, the 
>> optimizer has determined it is faster just to
>> load all the rows into memory and then filter.

> Sorry, my question was misleading. I do not want to use "set enable_seqscan =
> off"; I want to be sure that, when necessary (as the record count increases),
> the relevant index(es) will be used.

There's a considerable distance between "is the planner making appropriate
use of indexes" and "I insist on an index-only scan".  The reason you're
not getting an index-only scan here is that that requires an index that
includes every column referenced in the query, which you don't have.  At
minimum you'd need an index including all of autoinc, fileversion, and
filename to do this query with an IOS.  If you want it to be particularly
efficient for this query then you'd need the index's column order to be
(filename, autoinc, fileversion) --- putting filename first means the entries
satisfying the WHERE clause will be clumped together in the index, and putting
autoinc second
means that a backwards scan on that portion of the index is enough to
produce the requested sort ordering without an explicit sort step.
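Concretely, such an index would look about like this (the index name is just
illustrative; a plain ascending autoinc works because the planner can scan the
index backwards, and the table has to have been vacuumed recently enough that
its visibility map is mostly set before an index-only scan actually pays off):

CREATE INDEX updates_filename_autoinc_fileversion
    ON updates (filename, autoinc, fileversion);

EXPLAIN ANALYZE
SELECT autoinc, fileversion
  FROM updates
 WHERE filename = 'Robox.exe'
 ORDER BY autoinc DESC;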

Whether it's worth maintaining an index this specialized depends on how
much update traffic you have versus how often you want to do this
particular query.  Often it's not worth the extra disk space and update
overhead to have such an index.

In any case, I wouldn't worry about it until you have an actual
performance problem.  Trying to tell on toy data what the planner
will do with production-sized data is usually a losing game.

regards, tom lane




Re: [GENERAL] Help on Index only scan

2017-08-13 Thread Melvin Davidson
On Sun, Aug 13, 2017 at 7:37 PM, Ertan Küçükoğlu <
ertan.kucuko...@1nar.com.tr> wrote:

>
> On 14 Aug 2017, at 01:15, Melvin Davidson  wrote:
>
>
> On Sun, Aug 13, 2017 at 5:59 PM, Ertan Küçükoğlu <
> ertan.kucuko...@1nar.com.tr> wrote:
>
>> Hello,
>>
>> My table details:
>> robox=# \dS+ updates
>>                                       Table "public.updates"
>>     Column     |  Type   |                         Modifiers                          | Storage  | Stats target | Description
>> ---------------+---------+------------------------------------------------------------+----------+--------------+-------------
>>  autoinc       | integer | not null default nextval('updates_autoinc_seq'::regclass) | plain    |              |
>>  filename      | text    |                                                            | extended |              |
>>  dateofrelease | date    |                                                            | plain    |              |
>>  fileversion   | text    |                                                            | extended |              |
>>  afile         | text    |                                                            | extended |              |
>>  filehash      | text    |                                                            | extended |              |
>>  active        | boolean |                                                            | plain    |              |
>> Indexes:
>> "updates_pkey" PRIMARY KEY, btree (autoinc)
>> "update_filename" btree (filename)
>> "updates_autoinc" btree (autoinc DESC)
>> "updates_dateofrelease" btree (dateofrelease)
>> "updates_filename_dateofrelease" btree (filename, dateofrelease)
>>
>>
>> robox=# select count(autoinc) from updates;
>>  count
>> ---
>>   2003
>> (1 row)
>>
>> robox=# select autoinc, filename, fileversion from updates limit 10;
>>  autoinc | filename | fileversion
>> -+--+-
>>   18 | Robox.exe| 1.0.1.218
>>   19 | Robox.exe| 1.0.1.220
>>   20 | Robox.exe| 1.0.1.220
>>   21 | 8423bfc5a669864f9b66b6b15ce908b9 | 1.1.1.1
>>   22 | 4fdabb0c7adbc5a89fbe679ce76ccef9 | 1.1.1.1
>>   23 | f469d77bfa86c8917c7846c0f871137c | 1.1.1.1
>>   24 | bc10af4c8789718a9ca6565ea14cb17d | 1.1.1.1
>>   25 | d9f87ee46cdb41cd15c2f71ed599faf9 | 1.1.1.1
>>   26 | 6f7428a5364aae1d5914a66cba3e6f3b | 1.1.1.1
>>   27 | 66ec4cdb8d64ca1414f75c1fb9eaa518 | 1.1.1.1
>> (10 rows)
>>
>> I want to have an index only scan for my below query:
>> select autoinc, fileversion from updates where filename = 'Robox.exe'
>> order
>> by autoinc desc;
>>
>> I simply could not understand the planner and cannot provide the right
>> index for it.
>> The indexes "update_filename" and "updates_autoinc" below were added just
>> for the query that I would like to see use an index-only scan plan. I also
>> failed with the following indexes:
>> "autoinc desc, filename, fileversion"
>> "autoinc desc, filename"
>>
>> The first 3 rows in the above select results are actual data. You will find
>> that I have inserted about 2000 rows of dummy data to get a somewhat
>> meaningful plan for the query.
>>
>> Current planner result:
>> robox=# vacuum full;
>> VACUUM
>> robox=# explain analyze
>> robox-# select autoinc, fileversion
>> robox-# from updates
>> robox-# where filename = 'Robox.exe'
>> robox-# order by autoinc desc;
>>                                   QUERY PLAN
>> ---------------------------------------------------------------------------
>>  Sort  (cost=12.79..12.79 rows=3 width=12) (actual time=0.047..0.047
>> rows=3
>> loops=1)
>>Sort Key: autoinc DESC
>>Sort Method: quicksort  Memory: 25kB
>>->  Bitmap Heap Scan on updates  (cost=4.30..12.76 rows=3 width=12)
>> (actual time=0.040..0.040 rows=3 loops=1)
>>  Recheck Cond: (filename = 'Robox.exe'::text)
>>  Heap Blocks: exact=1
>>  ->  Bitmap Index Scan on update_filename  (cost=0.00..4.30 rows=3
>> width=0) (actual time=0.035..0.035 rows=3 loops=1)
>>Index Cond: (filename = 'Robox.exe'::text)
>>  Planning time: 1.873 ms
>>  Execution time: 0.076 ms
>> (10 rows)
>>
>>
>> I appreciate any help on choosing the right index(es), as I simply failed myself.
>>
>> Regards,
>> Ertan Küçükoğlu
>>
> First, you do not need index "updates_autoinc"; since autoinc is the
> Primary Key, you are just duplicating the index.
>
>
> Is that true even if that index is a descending one?
>
>
> As far as "Index only scan", since the table only has 2003 rows, the
> optimizer has determined it is faster just to load all the rows into memory
> and then filter. If you really want to force an index scan, then you would
> have to do SET enable_seqscan = off; before running the query. However, you
> are just shooting yourself in the foot by doing that, as it will make the
> query slower.
>
>
> I will try to load up more dummy rows to overflow the work_mem and observe
> results.
>
> Sorry, my question was misleading. I do not want to use "set
> enable_seqscan = off" I want to be sure that when necessary (record count
> 

Re: [GENERAL] Help on Index only scan

2017-08-13 Thread Ertan Küçükoğlu

> On 14 Aug 2017, at 01:15, Melvin Davidson  wrote:
> 
> 
>> On Sun, Aug 13, 2017 at 5:59 PM, Ertan Küçükoğlu 
>>  wrote:
>> Hello,
>> 
>> My table details:
>> robox=# \dS+ updates
>>                                       Table "public.updates"
>>     Column     |  Type   |                         Modifiers                          | Storage  | Stats target | Description
>> ---------------+---------+------------------------------------------------------------+----------+--------------+-------------
>>  autoinc       | integer | not null default nextval('updates_autoinc_seq'::regclass) | plain    |              |
>>  filename      | text    |                                                            | extended |              |
>>  dateofrelease | date    |                                                            | plain    |              |
>>  fileversion   | text    |                                                            | extended |              |
>>  afile         | text    |                                                            | extended |              |
>>  filehash      | text    |                                                            | extended |              |
>>  active        | boolean |                                                            | plain    |              |
>> Indexes:
>> "updates_pkey" PRIMARY KEY, btree (autoinc)
>> "update_filename" btree (filename)
>> "updates_autoinc" btree (autoinc DESC)
>> "updates_dateofrelease" btree (dateofrelease)
>> "updates_filename_dateofrelease" btree (filename, dateofrelease)
>> 
>> 
>> robox=# select count(autoinc) from updates;
>>  count
>> ---
>>   2003
>> (1 row)
>> 
>> robox=# select autoinc, filename, fileversion from updates limit 10;
>>  autoinc | filename | fileversion
>> -+--+-
>>   18 | Robox.exe| 1.0.1.218
>>   19 | Robox.exe| 1.0.1.220
>>   20 | Robox.exe| 1.0.1.220
>>   21 | 8423bfc5a669864f9b66b6b15ce908b9 | 1.1.1.1
>>   22 | 4fdabb0c7adbc5a89fbe679ce76ccef9 | 1.1.1.1
>>   23 | f469d77bfa86c8917c7846c0f871137c | 1.1.1.1
>>   24 | bc10af4c8789718a9ca6565ea14cb17d | 1.1.1.1
>>   25 | d9f87ee46cdb41cd15c2f71ed599faf9 | 1.1.1.1
>>   26 | 6f7428a5364aae1d5914a66cba3e6f3b | 1.1.1.1
>>   27 | 66ec4cdb8d64ca1414f75c1fb9eaa518 | 1.1.1.1
>> (10 rows)
>> 
>> I want to have an index only scan for my below query:
>> select autoinc, fileversion from updates where filename = 'Robox.exe' order
>> by autoinc desc;
>> 
>> I simply could not understand the planner and cannot provide the right index for it.
>> The indexes "update_filename" and "updates_autoinc" below were added just for
>> the query that I would like to see use an index-only scan plan. I also failed
>> with the following indexes:
>> "autoinc desc, filename, fileversion"
>> "autoinc desc, filename"
>> 
>> The first 3 rows in the above select results are actual data. You will find that I
>> have inserted about 2000 rows of dummy data to get a somewhat meaningful plan
>> for the query.
>> 
>> Current planner result:
>> robox=# vacuum full;
>> VACUUM
>> robox=# explain analyze
>> robox-# select autoinc, fileversion
>> robox-# from updates
>> robox-# where filename = 'Robox.exe'
>> robox-# order by autoinc desc;
>>                                   QUERY PLAN
>> ---------------------------------------------------------------------------
>>  Sort  (cost=12.79..12.79 rows=3 width=12) (actual time=0.047..0.047 rows=3
>> loops=1)
>>Sort Key: autoinc DESC
>>Sort Method: quicksort  Memory: 25kB
>>->  Bitmap Heap Scan on updates  (cost=4.30..12.76 rows=3 width=12)
>> (actual time=0.040..0.040 rows=3 loops=1)
>>  Recheck Cond: (filename = 'Robox.exe'::text)
>>  Heap Blocks: exact=1
>>  ->  Bitmap Index Scan on update_filename  (cost=0.00..4.30 rows=3
>> width=0) (actual time=0.035..0.035 rows=3 loops=1)
>>Index Cond: (filename = 'Robox.exe'::text)
>>  Planning time: 1.873 ms
>>  Execution time: 0.076 ms
>> (10 rows)
>> 
>> 
>> I appreciate any help on choosing the right index(es), as I simply failed myself.
>> 
>> Regards,
>> Ertan Küçükoğlu
>> 
> 
> First, you do not need index "updates_autoinc"; since autoinc is the Primary
> Key, you are just duplicating the index.

Is that true even if that index is a descending one?

> 
> As far as "Index only scan", since the table only has 2003 rows, the
> optimizer has determined it is faster just to load all the rows into memory
> and then filter. If you really want to force an index scan, then you would
> have to do SET enable_seqscan = off; before running the query. However, you
> are just shooting yourself in the foot by doing that, as it will make the
> query slower.

I will try to load up more dummy rows to overflow the work_mem and observe 
results.

Sorry, my question was misleading. I do not want to use "set enable_seqscan =
off"; I want to be sure that, when necessary (as the record count increases),
the relevant index(es) will be used.

Obviously I still can't read the query plan, as I did not understand that operation 

Re: [GENERAL] Help on Index only scan

2017-08-13 Thread Melvin Davidson
On Sun, Aug 13, 2017 at 5:59 PM, Ertan Küçükoğlu <
ertan.kucuko...@1nar.com.tr> wrote:

> Hello,
>
> My table details:
> robox=# \dS+ updates
>                                       Table "public.updates"
>     Column     |  Type   |                         Modifiers                          | Storage  | Stats target | Description
> ---------------+---------+------------------------------------------------------------+----------+--------------+-------------
>  autoinc       | integer | not null default nextval('updates_autoinc_seq'::regclass) | plain    |              |
>  filename      | text    |                                                            | extended |              |
>  dateofrelease | date    |                                                            | plain    |              |
>  fileversion   | text    |                                                            | extended |              |
>  afile         | text    |                                                            | extended |              |
>  filehash      | text    |                                                            | extended |              |
>  active        | boolean |                                                            | plain    |              |
> Indexes:
> "updates_pkey" PRIMARY KEY, btree (autoinc)
> "update_filename" btree (filename)
> "updates_autoinc" btree (autoinc DESC)
> "updates_dateofrelease" btree (dateofrelease)
> "updates_filename_dateofrelease" btree (filename, dateofrelease)
>
>
> robox=# select count(autoinc) from updates;
>  count
> ---
>   2003
> (1 row)
>
> robox=# select autoinc, filename, fileversion from updates limit 10;
>  autoinc | filename | fileversion
> -+--+-
>   18 | Robox.exe| 1.0.1.218
>   19 | Robox.exe| 1.0.1.220
>   20 | Robox.exe| 1.0.1.220
>   21 | 8423bfc5a669864f9b66b6b15ce908b9 | 1.1.1.1
>   22 | 4fdabb0c7adbc5a89fbe679ce76ccef9 | 1.1.1.1
>   23 | f469d77bfa86c8917c7846c0f871137c | 1.1.1.1
>   24 | bc10af4c8789718a9ca6565ea14cb17d | 1.1.1.1
>   25 | d9f87ee46cdb41cd15c2f71ed599faf9 | 1.1.1.1
>   26 | 6f7428a5364aae1d5914a66cba3e6f3b | 1.1.1.1
>   27 | 66ec4cdb8d64ca1414f75c1fb9eaa518 | 1.1.1.1
> (10 rows)
>
> I want to have an index only scan for my below query:
> select autoinc, fileversion from updates where filename = 'Robox.exe' order
> by autoinc desc;
>
> I simply could not understand the planner and cannot provide the right
> index for it.
> The indexes "update_filename" and "updates_autoinc" below were added just
> for the query that I would like to see use an index-only scan plan. I also
> failed with the following indexes:
> "autoinc desc, filename, fileversion"
> "autoinc desc, filename"
>
> The first 3 rows in the above select results are actual data. You will find
> that I have inserted about 2000 rows of dummy data to get a somewhat
> meaningful plan for the query.
>
> Current planner result:
> robox=# vacuum full;
> VACUUM
> robox=# explain analyze
> robox-# select autoinc, fileversion
> robox-# from updates
> robox-# where filename = 'Robox.exe'
> robox-# order by autoinc desc;
>                                   QUERY PLAN
> ---------------------------------------------------------------------------
>  Sort  (cost=12.79..12.79 rows=3 width=12) (actual time=0.047..0.047 rows=3
> loops=1)
>Sort Key: autoinc DESC
>Sort Method: quicksort  Memory: 25kB
>->  Bitmap Heap Scan on updates  (cost=4.30..12.76 rows=3 width=12)
> (actual time=0.040..0.040 rows=3 loops=1)
>  Recheck Cond: (filename = 'Robox.exe'::text)
>  Heap Blocks: exact=1
>  ->  Bitmap Index Scan on update_filename  (cost=0.00..4.30 rows=3
> width=0) (actual time=0.035..0.035 rows=3 loops=1)
>Index Cond: (filename = 'Robox.exe'::text)
>  Planning time: 1.873 ms
>  Execution time: 0.076 ms
> (10 rows)
>
>
> I appreciate any help on choosing the right index(es), as I simply failed myself.
>
> Regards,
> Ertan Küçükoğlu
>
>
>
>
>

First, you do not need index "updates_autoinc"; since autoinc is the
Primary Key, you are just duplicating the index.

As far as "Index only scan", since the table only has 2003 rows, the
optimizer has determined it is faster just to load all the rows into memory
and then filter. If you really want to force an index scan, then you would
have to do SET enable_seqscan = off; before running the query. However, you
are just shooting yourself in the foot by doing that, as it will make the
query slower.
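If the goal is only to check, in a throwaway test session, whether the planner
is able to use an index for this query at all, a sketch of that diagnostic
(do not leave the setting turned off):

SET enable_seqscan = off;   -- discourage seq scans for this session only

EXPLAIN ANALYZE
SELECT autoinc, fileversion
  FROM updates
 WHERE filename = 'Robox.exe'
 ORDER BY autoinc DESC;

RESET enable_seqscan;       -- back to normal planning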


-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-07 Thread Tom Lane
Ken Tanzer  writes:
>> FWIW, the business with making and editing a list file should work just
>> fine with a tar-format dump, not only with a custom-format dump.  The
>> metadata is all there in either case.

> The pg_dump doc page kinda suggests but doesn't quite say that you can't
> re-order tar files; between that and the error message I gave up on that
> possibility.  Are you suggesting it should work?

[ sorry for slow response ]

Ah, right: you can reorder simple object declarations, but you can't
change the relative order in which TABLE DATA objects are restored.
This is because the code doesn't support seeking in the tar file,
so it has to either read or skip each table-data subfile as it comes
to it.

It seems to me that that's just a small matter of programming to fix,
but few people use the tar format so nobody's bothered.

regards, tom lane




Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread Adrian Klaver

On 06/05/2017 05:59 PM, Ken Tanzer wrote:




Not sure why; I just know that if I stay within the guidelines it
works, and if I do not, it does not work :)


That's fair enough, leaving aside the curiosity part.  Usually though 
the things you can't do just aren't allowed.  It's easier to overlook 
something that you shouldn't (but can) do!


Yes, what you ran into is just a subset of a bigger issue. That being, 
there are many ways you can dump a database and not get what you wanted 
on the restore. Another example, that is similar, is using the -n switch 
to pg_dump when you have cross schema references in the schema you did dump.





Ken








--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread Ken Tanzer
>
> > So I can switch to Custom format for future backups.  But regarding the
> > existing backups I have in Tar format, is there any way to successfully
> > restore them?
>
> FWIW, the business with making and editing a list file should work just
> fine with a tar-format dump, not only with a custom-format dump.  The
> metadata is all there in either case.
>

I had tried that originally, but got an error:

bash-4.1$ pg_restore -L spc_restore_list.tmp -d spc_test_1
agency_backup.spc.2017.06.05_10.30.01.tar

pg_restore: [tar archiver] restoring data out of order is not supported in
this archive format: "10608.dat" is required, but comes before "10760.dat"
in the archive file.

The pg_dump doc page kinda suggests but doesn't quite say that you can't
re-order tar files; between that and the error message I gave up on that
possibility.  Are you suggesting it should work?

https://www.postgresql.org/docs/9.3/static/app-pgdump.html

The alternative archive file formats must be used with pg_restore to rebuild
the database. They allow pg_restore to be selective about what is restored,
or even to reorder the items prior to being restored. The archive file
formats are designed to be portable across architectures.

When used with one of the archive file formats and combined with pg_restore,
pg_dump provides a flexible archival and transfer mechanism. pg_dump can
be used to backup an entire database, then pg_restore can be used to
examine the archive and/or select which parts of the database are to be
restored. The most flexible output file formats are the "custom" format
(-Fc) and the "directory" format (-Fd). They allow for selection and
reordering of all archived items, support parallel restoration, and are
compressed by default. The "directory" format is the only format that
supports parallel dumps.
Cheers,
Ken
-- 
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://agency-software.org/demo/client
ken.tan...@agency-software.org
(253) 245-3801

Subscribe to the mailing list to
learn more about AGENCY or
follow the discussion.


Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread Tom Lane
Ken Tanzer  writes:
> ...The rest of the DB is fine, but tbl_payment has 0 rows.  I believe this is
> because tbl_payment has a constraint that calls a function has_perm() that
> relies on data in a couple of other tables, and that tbl_payment is being
> restored before those tables.  I was able to created a new dump in Custom
> format, reorder the List file, and restore that successfully.

> So I can switch to Custom format for future backups.  But regarding the
> existing backups I have in Tar format, is there any way to successfully
> restore them?

FWIW, the business with making and editing a list file should work just
fine with a tar-format dump, not only with a custom-format dump.  The
metadata is all there in either case.
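For anyone following along, the usual list-file workflow looks roughly like
this (file names here are made up; as the follow-up notes, a tar archive lets
you reorder the object declarations this way, but not the table-data entries):

pg_restore -l agency_backup.dump > restore.list   # dump the TOC to an editable list file
# edit restore.list: reorder lines, or comment entries out with a leading ';'
pg_restore -L restore.list -d spc_test_1 agency_backup.dump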

As already noted, it's hard to get pg_dump/pg_restore to cope
automatically with hidden dependencies like what you have here.
The fact that those other tables would need to be restored first
simply isn't visible to pg_dump.

regards, tom lane




Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread David G. Johnston
On Mon, Jun 5, 2017 at 6:21 PM, Ken Tanzer  wrote:

> I do get the "make \d show relevant information" argument and that is one
>> that seems easier to solve...
>>
>
> Maybe I'm missing something, but I'm not sure how you'd solve this or
> change what \d shows for a table.  Right now I get to see this in my \d:
>
> "authorized_approvers_only" CHECK (approved_by IS NULL OR 
> has_perm(approved_by, 'APPROVE_PAYMENT'::character varying, 'W'::character
> varying))
>
> But when I move that to a trigger, I'll only see the trigger name.  And
> while this procedure would be really short, others not so much, so you
> wouldn't really want to automatically display it inline.
>

​FWIW​

​I wouldn't show the trigger functions but I'd show something like:

CREATE ​trg_tbl2_exists_tbl3_missing_or_vice_versa
TRIGGER ON tbl1 CHANGES EXECUTE func_tbl1
REFERENCES tbl2 CHANGES EXECUTE func_tbl2
REFERENCES tbl3 CHANGES EXECUTE func_tbl3;

FOR tbl1
DEPENDS ON tbl2, tbl3 VIA TRIGGER
​trg_tbl2_exists_tbl3_missing_or_vice_versa

​FOR tbl2
DEPENDED ON BY tbl1 VIA TRIGGER ​​trg_tbl2_exists_tbl3_missing_or_vice_versa

FOR tbl3
DEPENDED ON BY tbl1 VIA TRIGGER ​​trg_tbl2_exists_tbl3_missing_or_vice_versa

I suspect this opens up the possibility of enforcing that trigger execution
doesn't touch tables other than those specified.

​David J.


Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread Ken Tanzer
>
> I do get the "make \d show relevant information" argument and that is one
> that seems easier to solve...
>

Maybe I'm missing something, but I'm not sure how you'd solve this or
change what \d shows for a table.  Right now I get to see this in my \d:

"authorized_approvers_only" CHECK (approved_by IS NULL OR
has_perm(approved_by, 'APPROVE_PAYMENT'::character varying,
'W'::character
varying))

But when I move that to a trigger, I'll only see the trigger name.  And
while this procedure would be really short, others not so much, so you
wouldn't really want to automatically display it inline.

Ken

-- 
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://agency-software.org/demo/client
ken.tan...@agency-software.org
(253) 245-3801

Subscribe to the mailing list to
learn more about AGENCY or
follow the discussion.


Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread David G. Johnston
On Mon, Jun 5, 2017 at 5:59 PM, Ken Tanzer  wrote:

> I can't really make this an FK.  I can (and probably will) put this into a
>>> trigger.  Although it seems like an extra layer of wrapping just to call a
>>> function.  I'm curious if there's any conceptual reason why constraints
>>> couldn't (as an option) be restored after all the data is loaded, and
>>> whether there would be any negative consequences of that?  I could see if
>>> your data still didn't pass the CHECKs, it's already loaded.  But the
>>> constraint could then be marked not valid?
>>>
>>
>> Not sure why; I just know that if I stay within the guidelines it works, and
>> if I do not, it does not work :)
>>
>>
> That's fair enough, leaving aside the curiosity part.  Usually though the
> things you can't do just aren't allowed.  It's easier to overlook something
> that you shouldn't (but can) do!
>
>
​I find in life most things that are prohibited are actually doable -
you're just punished if you get caught doing them.  In all seriousness
though I agree it would be nice if that's how this worked; but decades of
historical precedent makes actual preventive enforcement ​difficult if not
impossible.

Since "test your backups" covers this potential problem, and so many
possible others, any non-trivial effort to solve the actual problem is hard
to justify spending time on.

I do get the "make \d show relevant information" argument and that is one
that seems easier to solve, since adding explicit dependencies during
trigger creation would be a purely new feature.

David J.


Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread Ken Tanzer
>
> I can't really make this an FK.  I can (and probably will) put this into a
>> trigger.  Although it seems like an extra layer of wrapping just to call a
>> function.  I'm curious if there's any conceptual reason why constraints
>> couldn't (as an option) be restored after all the data is loaded, and
>> whether there would be any negative consequences of that?  I could see if
>> your data still didn't pass the CHECKs, it's already loaded.  But the
>> constraint could then be marked not valid?
>>
>
> Not sure why; I just know that if I stay within the guidelines it works, and
> if I do not, it does not work :)
>
>
That's fair enough, leaving aside the curiosity part.  Usually though the
things you can't do just aren't allowed.  It's easier to overlook something
that you shouldn't (but can) do!



> See that, but in your scenario you wanted to create a 'scratch' database
> so you are back to a user with privileges.


>
Yeah, I was thinking pg_dump could just conjure it up in the ether (and
then discard it), but I can see that doesn't really work.


Basically, if you have no way to test your backup/restore procedure before
> hand you are flying blind.
>
>
In this case, we had tested the restore part.  But then we changed the DB
in a way that made it stop working.  Good reminder to retest that
periodically!

Ken




-- 
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://agency-software.org/demo/client
ken.tan...@agency-software.org
(253) 245-3801

Subscribe to the mailing list to
learn more about AGENCY or
follow the discussion.


Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread Adrian Klaver

On 06/05/2017 05:15 PM, Ken Tanzer wrote:
Thanks Adrian and David.  That all makes sense, and I gather the answer 
regarding the existing dumps is "no, they can't be restored."  So be 
it.  Here's a couple of follow-on comments::


Ideally figure out how to write an actual FK constraint - otherwise
use triggers.


I can't really make this an FK.  I can (and probably will) put this into 
a trigger.  Although it seems like an extra layer of wrapping just to 
call a function.  I'm curious if there's any conceptual reason why 
constraints couldn't (as an option) be restored after all the data is 
loaded, and whether there would be any negative consequences of that?  I 
could see if your data still didn't pass the CHECKs, it's already 
loaded.  But the constraint could then be marked not valid?


Not sure why; I just know that if I stay within the guidelines it works, and if
I do not, it does not work :)





-1; pg_dump should not be trying to restore things.​  The core
developers shouldn't really concern themselves with the various and
sundry ways people might want to setup such a process.  You have
tools for dump, and tools for restore, and you can combine them in
whatever fashion you deem useful.  Or otherwise acquire someone
else's ideas.


I get that as a general principle.  OTOH, being able to restore your 
backups isn't just a random or inconsequential feature.  I have access 
to the superuser and can create DBs, but users in more locked down 
scenarios might not be able to do so.




See that, but in your scenario you wanted to create a 'scratch' database 
so you are back to a user with privileges.  Then there is the whole 
overhead of doing a restore twice. Basically, if you have no way to test 
your backup/restore procedure before hand you are flying blind.



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread John R Pierce

On 6/5/2017 5:49 PM, David G. Johnston wrote:
On Mon, Jun 5, 2017 at 5:40 PM, John R Pierce wrote:


indeed, any sort of constraint that invokes a function call which
looks at other tables could later be invalidated if those other
tables change, and postgres would be none the smarter.   the same
goes for trigger based checks.


​ Yes.  I could imagine a new kind of "multi-referential trigger" that 
would specify all relations it touches and the function to fire when 
each of them is updated.  While you'd still have to write the 
functions correctly it would at least allow one to explicitly model 
the multi-table dynamic in pg_catalog.  Lacking that CHECK is no worse 
than TRIGGER and we've decided to say "use triggers".



at $job, the policy is, AVOID ALL TRIGGERS AND FANCY CONSTRAINTS :)

they don't even like using foreign key references, and rely on code 
logic to do most joins in the performance-critical OLTP side of things.



--
john r pierce, recycling bits in santa cruz



Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread David G. Johnston
On Mon, Jun 5, 2017 at 5:40 PM, John R Pierce  wrote:

> indeed, any sort of constraint that invokes a function call which looks at
> other tables could later be invalidated if those other tables change, and
> postgres would be none the smarter.   the same goes for trigger based
> checks.
>

​Yes.  I could imagine a new kind of "multi-referential trigger" that would
specify all relations it touches and the function to fire when each of them
is updated.  While you'd still have to write the functions correctly it
would at least allow one to explicitly model the multi-table dynamic in
pg_catalog.  Lacking that CHECK is no worse than TRIGGER and we've decided
to say "use triggers".

David J.​


Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread Ken Tanzer
>
> Aside from being a bit more verbose there is nothing useful that writing
> this as "CHECK function()" provides that you don't also get by writing
> "CREATE TRIGGER".
>

I agree you get the same result.  It may be a minor issue, but for me it is
convenient to see the logic spelled out when using \d on the table.

Cheers,
Ken

-- 
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://agency-software.org/demo/client
ken.tan...@agency-software.org
(253) 245-3801

Subscribe to the mailing list to
learn more about AGENCY or
follow the discussion.


Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread John R Pierce

On 6/5/2017 5:32 PM, David G. Johnston wrote:
On Mon, Jun 5, 2017 at 5:15 PM, Ken Tanzer wrote:


From the docs:
https://www.postgresql.org/docs/9.6/static/sql-createtable.html

"Currently, CHECK expressions cannot contain subqueries nor
refer to variables other than columns of the current row. The
system column tableoid may be referenced, but not any other
system column.


I wonder if that should say "should not," or be followed by
something like this:


Make it say "must not" and I'd agree to change the word "cannot" and 
leave the rest.  Adding a note regarding functions seems appropriate.


Aside from being a bit more verbose there is nothing useful that 
writing this as "CHECK function()" provides that you don't also get by 
writing "CREATE TRIGGER". In a green field we'd probably lock down 
CHECK a bit more but there is too much code that is technically wrong 
but correctly functioning that we don't want to break.  IOW, we cannot 
mandate that the supplied function be immutable even though we 
should.  And we don't even enforce immutable execution if a function 
is defined that way.



indeed, any sort of constraint that invokes a function call which looks 
at other tables could later be invalidated if those other tables change, 
and postgres would be none the smarter.   the same goes for trigger 
based checks.




--
john r pierce, recycling bits in santa cruz



Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread David G. Johnston
On Mon, Jun 5, 2017 at 5:15 PM, Ken Tanzer  wrote:

> From the docs:
>> https://www.postgresql.org/docs/9.6/static/sql-createtable.html
>> "Currently, CHECK expressions cannot contain subqueries nor refer to
>> variables other than columns of the current row. The system column tableoid
>> may be referenced, but not any other system column.
>
>
> I wonder if that should say "should not," or be followed by something like
> this:
>
>
Make it say "must not" and I'd agree to change the word "cannot" and leave
the rest.  Adding a note regarding functions seems appropriate.

Aside from being a bit more verbose there is nothing useful that writing
this as "CHECK function()" provides that you don't also get by writing
"CREATE TRIGGER". In a green field we'd probably lock down CHECK a bit more
but there is too much code that is technically wrong but correctly
functioning that we don't want to break.  IOW, we cannot mandate that the
supplied function be immutable even though we should.  And we don't even
enforce immutable execution if a function is defined that way.

​David J.​


Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread John R Pierce

On 6/5/2017 5:15 PM, Ken Tanzer wrote:
I can't really make this an FK.  I can (and probably will) put this 
into a trigger.  Although it seems like an extra layer of wrapping 
just to call a function.  I'm curious if there's any conceptual reason 
why constraints couldn't (as an option) be restored after all the data 
is loaded, and whether there would be any negative consequences of 
that?  I could see if your data still didn't pass the CHECKs, it's 
already loaded.  But the constraint could then be marked not valid?



when you have constraints that rely on calling functions, how would it 
know what order to check things in ?



--
john r pierce, recycling bits in santa cruz





Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread Ken Tanzer
Thanks Adrian and David.  That all makes sense, and I gather the answer
regarding the existing dumps is "no, they can't be restored."  So be it.
Here's a couple of follow-on comments:

Ideally figure out how to write an actual FK constraint - otherwise use
> triggers.


I can't really make this an FK.  I can (and probably will) put this into a
trigger.  Although it seems like an extra layer of wrapping just to call a
function.  I'm curious if there's any conceptual reason why constraints
couldn't (as an option) be restored after all the data is loaded, and
whether there would be any negative consequences of that?  I could see if
your data still didn't pass the CHECKs, it's already loaded.  But the
constraint could then be marked not valid?


-1; pg_dump should not be trying to restore things.​  The core developers
> shouldn't really concern themselves with the various and sundry ways people
> might want to setup such a process.  You have tools for dump, and tools for
> restore, and you can combine them in whatever fashion you deem useful.  Or
> otherwise acquire someone else's ideas.


I get that as a general principle.  OTOH, being able to restore your
backups isn't just a random or inconsequential feature.  I have access to
the superuser and can create DBs, but users in more locked down scenarios
might not be able to do so.


From the docs:
> https://www.postgresql.org/docs/9.6/static/sql-createtable.html
> "Currently, CHECK expressions cannot contain subqueries nor refer to
> variables other than columns of the current row. The system column tableoid
> may be referenced, but not any other system column.


I wonder if that should say "should not," or be followed by something like
this:

n.b., In CHECK expressions, Postgres will not prevent you from calling
functions that reference other rows or tables.  However, doing so may have
undesirable consequences, including the possible inability to restore from
output created by pg_dump.

(Are there other possible pitfalls too, or is that the only one?)

Cheers,
Ken


-- 
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://agency-software.org/demo/client
ken.tan...@agency-software.org
(253) 245-3801

Subscribe to the mailing list to
learn more about AGENCY or
follow the discussion.


Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread Adrian Klaver

On 06/05/2017 03:35 PM, Ken Tanzer wrote:

On 9.3.17, I tried to restore a tar from pg_dump.  It failed thusly:

bash-4.1$ pg_restore -d spc_test_1 agency_backup.spc.2017.06.05_10.30.01.tar

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 10608; 0 107743 TABLE 
DATA tbl_payment spc
pg_restore: [archiver (db)] COPY failed for table "tbl_payment": ERROR: 
  new row for relation "tbl_payment" violates check constraint 
"authorized_approvers_only"
DETAIL:  Failing row contains (286541, 3685, 2015-09-14, ADJUST, null, 
null, 137798, 93.00, HONEY, 4841, 2, SHONCRE, September adjustment, 
2015-10-01, null, null, null, null, null, f, f, t, f, f, f, f, null, 
null, null, null, 6, 2015-09-14 16:43:37, 25, 2016-02-08 16:34:20, f, 
null, null, null, Adjusting approved_at to changed_at for first few 
approvals

, 6, 2015-09-14 16:43:37, 2015-09-17).
CONTEXT:  COPY tbl_payment, line 179785: "2865413685   
  2015-09-14  ADJUST  \N  \N  137798  93.00   HONEY   48412

 SHONCRE September adjustment2015-10-0..."
WARNING: errors ignored on restore: 1

The rest of the DB is fine, but tbl_payment has 0 rows.  I believe this 
is because tbl_payment has a constraint that calls a function has_perm() 
that relies on data in a couple of other tables, and that tbl_payment is 
being restored before those tables.  I was able to created a new dump in 
Custom format, reorder the List file, and restore that successfully.


See this thread for more info:
https://www.postgresql.org/message-id/alpine.DEB.2.20.1703311620581.12863%40tglase.lan.tarent.de

From the docs:

https://www.postgresql.org/docs/9.6/static/sql-createtable.html

"Currently, CHECK expressions cannot contain subqueries nor refer to 
variables other than columns of the current row. The system column 
tableoid may be referenced, but not any other system column.




So I can switch to Custom format for future backups.  But regarding the 
existing backups I have in Tar format, is there any way to successfully 
restore them?  Specifically:


  * Any way to ignore or delay constraint checking?  Something like
disable-triggers?

  * Any way to tell pg_restore to skip past the failing row, and restore
the rest of what was in tbl_payment?

  * Some other way to go about this?


Change the check constraint to a trigger.
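A rough sketch of that conversion, reusing the has_perm() call from the
existing check constraint (untested and only illustrative; adjust names and
the error message to taste):

CREATE FUNCTION tbl_payment_check_approver() RETURNS trigger AS $$
BEGIN
    IF NEW.approved_by IS NOT NULL
       AND NOT has_perm(NEW.approved_by, 'APPROVE_PAYMENT', 'W') THEN
        RAISE EXCEPTION 'approved_by % is not an authorized approver',
            NEW.approved_by;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER authorized_approvers_only
    BEFORE INSERT OR UPDATE ON tbl_payment
    FOR EACH ROW EXECUTE PROCEDURE tbl_payment_check_approver();

pg_restore's --disable-triggers switch (or a manual ALTER TABLE ... DISABLE
TRIGGER) can then skip the check during a data-only restore.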



I also wonder if you folks might consider adding something like a 
--test_restore option to pg_dump that would attempt to create a new 
(scratch) DB from the output it creates, and report any errors?  I know 


Not that I know of. It would be easy enough to point pg_restore at your 
own scratch database for testing purposes.


the pieces are all there for us users to do that ourselves, but it would 
be handy for automated backups and might help us to avoid creating 
backups that won't restore successfully.  In my case, I think the 
problem started from changes we made about 9 months ago, and happily I 
discovered it during development/testing and not after a DB crash, which 
is why I'm also happily not gouging my eyeballs out right now. :)


Cheers, and thanks in advance!

Ken


--
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://agency-software.org/demo/client
ken.tan...@agency-software.org
(253) 245-3801

Subscribe to the mailing list to
learn more about AGENCY or
follow the discussion.



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Help with restoring a dump in Tar format? (dependencies/ordering)

2017-06-05 Thread David G. Johnston
On Mon, Jun 5, 2017 at 3:35 PM, Ken Tanzer  wrote:

> I believe this is because tbl_payment has a constraint that calls a
> function has_perm() that relies on data in a couple of other tables
>

​Indeed this is the cause.  That configuration is not supported.  If you
need to lookup values in other tables you either need to use an actual FK
constraint or create a trigger for the validation.


> So I can switch to Custom format for future backups.  But regarding the
> existing backups I have in Tar format, is there any way to successfully
> restore them?  Specifically:
>
> - Any way to ignore or delay constraint checking?  Something like
> disable-triggers?

Using and then disabling triggers is the "closest" solution.

> - Any way to tell pg_restore to skip past the failing row, and restore
> the rest of what was in tbl_payment?

No, COPY doesn't have that capability and that is what is being used
under the hood.

> - Some other way to go about this?

Ideally figure out how to write an actual FK constraint - otherwise use
triggers.


> I also wonder if you folks might consider adding something like a
> --test_restore option to pg_dump
>

-1; pg_dump should not be trying to restore things.​  The core developers
shouldn't really concern themselves with the various and sundry ways people
might want to setup such a process.  You have tools for dump, and tools for
restore, and you can combine them in whatever fashion you deem useful.  Or
otherwise acquire someone else's ideas.

​David J.​


Re: [GENERAL] Help with terminology to describe what my software does please?

2017-05-28 Thread Neil Anderson
>> Cluster comparison would only occur if you have two or more clusters on
>> the same server, although it's possible to compare across servers,
>
>
> Explain, because as I understand it a server = one cluster:
>

I think he was using server in the server=one machine sense, ie a
single machine/server can have multiple clusters/database servers.

> https://www.postgresql.org/docs/9.6/static/app-pg-ctl.html
>
> "The init or initdb mode creates a new PostgreSQL database cluster. A
> database cluster is a collection of databases that are managed by a single
> server instance. This mode invokes the initdb command. See initdb for
> details."
>
>> but that would involve a lot more work. AFAIK, the only differences for a
>> cluster would be:
>> 1. PostgreSQL version
>> 2. path to database
>> 3. database users (note: it is also possible to make users database
>> specific)
>> 4. list of defined databases
>
>
> And anything different below the above, I am thinking checking a dev cluster
> against a production cluster.
>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com

-- 
Neil Anderson
n...@postgrescompare.com
https://www.postgrescompare.com





Re: [GENERAL] Help with terminology to describe what my software does please?

2017-05-28 Thread Neil Anderson
>
>
> Cluster comparison would only occur if you have two or more clusters on
> the same server, although it's possible to compare across servers,
> but that would involve a lot more work. AFAIK, the only differences for a
> cluster would be:
> 1. PostgreSQL version
> 2. path to database
> 3. database users (note: it is also possible to make users database
> specific)
> 4. list of defined databases
>

I was considering configuration settings to be at the cluster level too.
Stuff from pg_settings or pg_config. Also I think tablespaces are at that
level too. What do you think?
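As a concrete starting point (just a sketch of the sort of queries involved),
the non-default settings and the tablespaces could be pulled with something
like:

SELECT name, setting, source
  FROM pg_settings
 WHERE source NOT IN ('default', 'override')
 ORDER BY name;

SELECT spcname, pg_tablespace_location(oid) AS location
  FROM pg_tablespace
 ORDER BY spcname;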


> Database comparison would involve db names, owners, encodings, tablespaces
> and acl's
> You might also want to include sizes. You can use the following two
> queries to help
> with that
>
> SELECT db.datname,
>au.rolname as datdba,
>pg_encoding_to_char(db.encoding) as encoding,
>db.datallowconn,
>db.datconnlimit,
>db.datfrozenxid,
>tb.spcname as tblspc,
>db.datacl
>   FROM pg_database db
>   JOIN pg_authid au ON au.oid = db.datdba
>   JOIN pg_tablespace tb ON tb.oid = db.dattablespace
>  ORDER BY 1;
>
> SELECT datname,
>pg_size_pretty(pg_database_size(datname))as size_pretty,
>pg_database_size(datname) as size,
>(SELECT pg_size_pretty (SUM( pg_database_size(datname))::bigint)
>   FROM pg_database)  AS total,
>((pg_database_size(datname) / (SELECT SUM(
> pg_database_size(datname))
>FROM pg_database) ) *
> 100)::numeric(6,3) AS pct
>   FROM pg_database
>   ORDER BY datname;
>

That's a great idea! Thanks for the info.


>
>

>
> --
> *Melvin Davidson*
> I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you.
>



-- 
Neil Anderson
n...@postgrescompare.com
https://www.postgrescompare.com


Re: [GENERAL] Help with terminology to describe what my software does please?

2017-05-28 Thread Adrian Klaver

On 05/28/2017 07:53 AM, Melvin Davidson wrote:











Cluster comparison would only occur if you have two or more clusters on 
the same server, although it's possible to compare across servers,


Explain, because as I understand it a server = one cluster:

https://www.postgresql.org/docs/9.6/static/app-pg-ctl.html

"The init or initdb mode creates a new PostgreSQL database cluster. A 
database cluster is a collection of databases that are managed by a 
single server instance. This mode invokes the initdb command. See initdb 
for details."


but that would involve a lot more work. AFAIK, the only differences for 
a cluster would be:

1. PostgreSQL version
2. path to database
3. database users (note: it is also possible to make users database 
specific)

4. list of defined databases


And anything different below the above, I am thinking checking a dev 
cluster against a production cluster.




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Help with terminology to describe what my software does please?

2017-05-28 Thread Tom Lane
Neil Anderson  writes:
> I guess I don't know what is the most common way to say that it
> compares everything but the data. Any suggestions from your
> experience?

FWIW, I think it's pretty common to use "schema" in an abstract way
to mean "the structure of your database", ie everything but the data.
(It's unfortunate that the SQL standard commandeered the word to
mean a database namespace; but it's not like there are no other words
with more than one meaning.)

So I don't see any big problem with calling your tool a schema comparator.
You could maybe make your docs a bit clearer if you consistently refer
to the namespace objects as "SQL schemas", reserving the generic term
for the generic meaning.

regards, tom lane




Re: [GENERAL] Help with terminology to describe what my software does please?

2017-05-28 Thread Melvin Davidson
On Sun, May 28, 2017 at 9:51 AM, Adrian Klaver 
wrote:

> On 05/28/2017 05:49 AM, Neil Anderson wrote:
>
>> Hi,
>>
>> I'm working on a tool that can compare the properties of Postgres
>> objects from different instances, finding the differences and
>> outputting the update SQL.
>>
>> It can compare objects that are defined at the cluster, database or
>> schema level. As such I'm finding it difficult to describe what the
>> tool does simply and accurately. I've tried 'compares PostgreSQL
>> schemas' but that doesn't capture the database and cluster parts,
>> 'compares PostgreSQL schema and database objects'. That sort of thing.
>> Right now I have a mix of terms on my website and I would prefer to
>> tighten it up.
>>
>> I guess I don't know what is the most common way to say that it
>> compares everything but the data. Any suggestions from your
>> experience?
>>
>
> From above the first sentence of the second paragraph seems to me the best
> description of what you are doing.
>
>
>> Thanks,
>> Neil
>>
>>
>>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>
>


Cluster comparison would only occur if you have two or more clusters on the
same server, although it's possible to compare across servers,
but that would involve a lot more work. AFAIK, the only differences for a
cluster would be:
1. PostgreSQL version
2. path to database
3. database users (note: it is also possible to make users database
specific)
4. list of defined databases

Database comparison would involve db names, owners, encodings, tablespaces
and acl's
You might also want to include sizes. You can use the following two queries
to help
with that

SELECT db.datname,
   au.rolname as datdba,
   pg_encoding_to_char(db.encoding) as encoding,
   db.datallowconn,
   db.datconnlimit,
   db.datfrozenxid,
   tb.spcname as tblspc,
   db.datacl
  FROM pg_database db
  JOIN pg_authid au ON au.oid = db.datdba
  JOIN pg_tablespace tb ON tb.oid = db.dattablespace
 ORDER BY 1;

SELECT datname,
   pg_size_pretty(pg_database_size(datname))as size_pretty,
   pg_database_size(datname) as size,
   (SELECT pg_size_pretty (SUM( pg_database_size(datname))::bigint)
  FROM pg_database)  AS total,
   ((pg_database_size(datname) / (SELECT SUM(
pg_database_size(datname))
   FROM pg_database) ) *
100)::numeric(6,3) AS pct
  FROM pg_database
  ORDER BY datname;

 schema comparison is a lot more complicated as it involves comparing (a
 starter inventory query is sketched after this list):
 collations
 domains
 functions
 trigger functions
 sequences
 tables
 types
 views
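As a very rough starting point for that kind of inventory (a sketch only; it
counts objects per schema by kind rather than diffing their definitions):

SELECT n.nspname AS schema,
       c.relkind  AS kind,    -- r=table, v=view, S=sequence, i=index, ...
       count(*)   AS objects
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE n.nspname NOT IN ('pg_catalog', 'information_schema')
 GROUP BY 1, 2
 ORDER BY 1, 2;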

-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Help with terminology to describe what my software does please?

2017-05-28 Thread Adrian Klaver

On 05/28/2017 05:49 AM, Neil Anderson wrote:

Hi,

I'm working on a tool that can compare the properties of Postgres
objects from different instances, finding the differences and
outputting the update SQL.

It can compare objects that are defined at the cluster, database or
schema level. As such I'm finding it difficult to describe what the
tool does simply and accurately. I've tried 'compares PostgreSQL
schemas' but that doesn't capture the database and cluster parts,
'compares PostgreSQL schema and database objects'. That sort of thing.
Right now I have a mix of terms on my website and I would prefer to
tighten it up.

I guess I don't know what is the most common way to say that it
compares everything but the data. Any suggestions from your
experience?


From above the first sentence of the second paragraph seems to me the 
best description of what you are doing.




Thanks,
Neil





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Help: Installing 9.6 breaks local connections to 9.2 on Centos 6.9

2017-05-16 Thread Tom Lane
Magnus Hagander  writes:
> On Tue, May 16, 2017 at 10:00 AM, Devrim Gündüz  wrote:
>> Not sure whether we should *fix* this or not on RPM side. This may break
>> some of the existing installations, right?

> Changing that in a minor version seems like a *really* bad idea, because
> things *will* break. The way it is now it only breaks in case of a major
> version upgrade, and there is an easy enough workaround present.

Yeah, you don't have a lot of room in a minor release to make changes
that would affect this.

What Red Hat did about this, when I worked there, was to back-port the
unix_socket_directories patch from 9.3 into earlier branches, and then
set up the default server configuration to create sockets in both
/var/run/postgresql and /tmp.  But even if you did that, it'd require
an upgrade of the 9.2 installation before it would play nice with a
9.6 libpq, so that might be surprising.  (It would also break existing
9.2 installations that were explicitly setting unix_socket_directory,
but we can hope there are very few of those.)
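For reference, the 9.3+ spelling of that both-directories setup is a single
postgresql.conf line (just a sketch; 9.2 and earlier only have the singular
unix_socket_directory, which holds one path):

unix_socket_directories = '/var/run/postgresql, /tmp'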

regards, tom lane




Re: [GENERAL] Help: Installing 9.6 breaks local connections to 9.2 on Centos 6.9

2017-05-16 Thread Adrian Klaver

On 05/16/2017 01:00 AM, Devrim Gündüz wrote:


Hi,

On Mon, 2017-05-15 at 22:35 -0700, Ken Tanzer wrote:

https://redmine.postgresql.org/issues/2409


Not sure whether we should *fix* this or not on RPM side. This may break some
of the existing installations, right?

I'm not objecting, just asking for opinions.


To me the principle of least surprise says that it should be fixed. At
this point a pre-9.4 server is putting its socket where the primary
client library (libpq) for that server cannot find it, if a 9.4+ server is
also installed. The options seem to be:


1) Use the libpq appropriate for each Postgres version.

2) Modify the postgresql.conf to point at the socket directory that the 
controlling libpq is looking for. I could see this being messy.


3) Document the change in behavior. Possibly here:

https://www.postgresql.org/download/linux/redhat/

PostgreSQL Yum Repository



Regards,




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Help: Installing 9.6 breaks local connections to 9.2 on Centos 6.9

2017-05-16 Thread Magnus Hagander
On Tue, May 16, 2017 at 10:00 AM, Devrim Gündüz  wrote:

>
> Hi,
>
> On Mon, 2017-05-15 at 22:35 -0700, Ken Tanzer wrote:
> > https://redmine.postgresql.org/issues/2409
>
> Not sure whether we should *fix* this or not on RPM side. This may break
> some
> of the existing installations, right?
>
> I'm not objecting, just asking for opinions.
>
>
Changing that in a minor version seems like a *really* bad idea, because
things *will* break. The way it is now it only breaks in case of a major
version upgrade, and there is an easy enough workaround present.

But it should perhaps be more clearly documented somewhere.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [GENERAL] Help: Installing 9.6 breaks local connections to 9.2 on Centos 6.9

2017-05-16 Thread Devrim Gündüz

Hi,

On Mon, 2017-05-15 at 22:35 -0700, Ken Tanzer wrote:
> https://redmine.postgresql.org/issues/2409

Not sure whether we should *fix* this or not on RPM side. This may break some
of the existing installations, right?

I'm not objecting, just asking for opinions.

Regards,
-- 
Devrim Gündüz
EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR




Re: [GENERAL] Help: Installing 9.6 breaks local connections to 9.2 on Centos 6.9

2017-05-15 Thread Ken Tanzer
On Mon, May 15, 2017 at 4:45 PM, Adrian Klaver 
wrote:

> On 05/15/2017 01:40 PM, Ken Tanzer wrote:
>
>
>
>> But let me ask, is there a big warning about this somewhere I missed?
>> Can the 9.2 updates do something to fix this, or at least create a warning
>> or an RPMNEW file?  I'm happy this is a cloud server and that I worked on a
>> copy.  However, in different circumstances I might well have reasoned
>> "well, installing the 9.6 packages really should be safe for 9.2, since
>> they're clearly meant to exist side-by-side."  And then have a setup that
>> no longer worked as it once did.  With an RHEL clone and PGDG packages
>> straight from the horse's mouth, I'd have higher expectations than that.
>> Only because of the great work y'all do! ;)
>>
>
> Might want to file an issue here:
>
> https://redmine.postgresql.org/projects/pgrpms/
>
> You will need a Postgres community account, which you can sign up for on
> the same page.
>
>
>>
Done, and thanks for pointing me to the tracker.

https://redmine.postgresql.org/issues/2409

Cheers,
Ken


Re: [GENERAL] Help: Installing 9.6 breaks local connections to 9.2 on Centos 6.9

2017-05-15 Thread Adrian Klaver

On 05/15/2017 01:40 PM, Ken Tanzer wrote:




But let me ask, is there a big warning about this somewhere I missed?  
Can the 9.2 updates do something to fix this, or at least create a 
warning or an RPMNEW file?  I'm happy this is a cloud server and that I 
worked on a copy.  However, in different circumstances I might well have 
reasoned "well, installing the 9.6 packages really should be safe for 
9.2, since they're clearly meant to exist side-by-side."  And then have 
a setup that no longer worked as it once did.  With an RHEL clone and 
PGDG packages straight from the horse's mouth, I'd have higher 
expectations than that.  Only because of the great work y'all do! ;)


Might want to file an issue here:

https://redmine.postgresql.org/projects/pgrpms/

You will need a Postgres community account, which you can sign up for on 
the same page.




Cheers,
Ken






--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help: Installing 9.6 breaks local connections to 9.2 on Centos 6.9

2017-05-15 Thread Devrim Gündüz

Hi,

On Mon, 2017-05-15 at 16:34 -0400, Tom Lane wrote:
> > bash-4.1$ /usr/pgsql-9.2/bin/psql -p 5432
> > psql: could not connect to server: Connection refused
> >    Is the server running locally and accepting
> >    connections on Unix domain socket
> > "/var/run/postgresql/.s.PGSQL.5432"?
> 
> The default is actually compiled into libpq.so, not psql itself.
> So I'm thinking what's happening here is the 9.2 psql is picking
> up a libpq.so supplied by 9.6.

Yeah, sorry, my bad. I forgot that the RPMs also put a file under
/etc/ld.so.conf.d, so that the latest libpq is picked up.

Regards,
-- 
Devrim Gündüz
EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR




Re: [GENERAL] Help: Installing 9.6 breaks local connections to 9.2 on Centos 6.9

2017-05-15 Thread Ken Tanzer
>
>
>> Workarounds:
>>
>> * You can connect to 9.2 using /usr/pgsql-9.2/bin/psql command. It knows
>> the
>> old socket directory.
>>
>
> That was where I was going until I saw this in the OP:
>
> bash-4.1$ /usr/pgsql-9.2/bin/psql -p 5432
> psql: could not connect to server: Connection refused
> Is the server running locally and accepting
> connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.
> 5432"?
>
>
>
>> * Pass -h /tmp to 9.6's psql, so that it connects to 9.2 instance.
>>
>> -HTH
>>
>> Regards,
>>
>>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Thanks everyone for the replies.  Adrian is right--I did try this with the
9.2 binaries, with the same problem.  But to address Tom's question (and if
I'm using ldd properly), the 9.2 psql binary is using the 9.6 libpq.

[root@centos-new postgresql]# ldd /usr/bin/psql | grep libpq
libpq.so.5 => /usr/pgsql-9.6/lib/libpq.so.5 (0x7f2e6c99a000)
[root@centos-new postgresql]# ldd /usr/pgsql-9.2/bin/psql | grep libpq
libpq.so.5 => /usr/pgsql-9.6/lib/libpq.so.5 (0x7f52f9c67000)

Devrim--the -h /tmp option works great.

I still wanted this to just "work" though, for scripts and such.  I
specified the socket directory in the 9.2 postgresql.conf, and it seems to
be working "normally" now.

But let me ask, is there a big warning about this somewhere I missed?  Can
the 9.2 updates do something to fix this, or at least create a warning or
an RPMNEW file?  I'm happy this is a cloud server and that I worked on a
copy.  However, in different circumstances I might well have reasoned
"well, installing the 9.6 packages really should be safe for 9.2, since
they're clearly meant to exist side-by-side."  And then have a setup that
no longer worked as it once did.  With an RHEL clone and PGDG packages
straight from the horse's mouth, I'd have higher expectations than that.
Only because of the great work y'all do! ;)

Cheers,
Ken



-- 
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://agency-software.org/demo/client
ken.tan...@agency-software.org
(253) 245-3801

Subscribe to the mailing list to learn more about AGENCY or follow the discussion.


Re: [GENERAL] Help: Installing 9.6 breaks local connections to 9.2 on Centos 6.9

2017-05-15 Thread Tom Lane
Adrian Klaver  writes:
> On 05/15/2017 01:10 PM, Devrim Gündüz wrote:
>> * You can connect to 9.2 using /usr/pgsql-9.2/bin/psql command. It knows the
>> old socket directory.

> That was where I was going until I saw this in the OP:

> bash-4.1$ /usr/pgsql-9.2/bin/psql -p 5432
> psql: could not connect to server: Connection refused
>   Is the server running locally and accepting
>   connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

The default is actually compiled into libpq.so, not psql itself.
So I'm thinking what's happening here is the 9.2 psql is picking
up a libpq.so supplied by 9.6.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help: Installing 9.6 breaks local connections to 9.2 on Centos 6.9

2017-05-15 Thread Adrian Klaver

On 05/15/2017 01:10 PM, Devrim Gündüz wrote:


Hi,

On Mon, 2017-05-15 at 12:55 -0700, Ken Tanzer wrote:

Hi.  On a Centos 6.9 server (in the cloud with Rackspace), I'm wanting to
install PGDG 9.6 alongside the already-running 9.2.  After installing the
9.6 packages (and even before doing an initdb), I am no
longer able to make a local connection to the 9.2 server.  Instead I get
the message:

psql: could not connect to server: Connection refused
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

That socket file does not exist on the server. (And in fact, the
/var/run/postgresql directory didn't exist before installing 9.6).  When I
configure 9.6 to use port 5433 and run it, it does create that socket for
5433.  I tried creating such a socket manually for 5432, but that didn't
seem to change anything.

Any help in getting this working and/or pointing out what I'm missing would
be great.  I'm also confused conceptually about what is happening here.
What is it that the installation (but not execution) of 9.6 does that's
blocking the local 9.2 access?  I'm guessing it's gotta be something in the
RPM install scripts.


PGDG RPMs use alternatives method, to replace some binaries that can be used
across multiple PostgreSQL versions, and psql is one of them. When you install
9.6, 9.6's psql has higher priority than 9.2, so that one is used -- and 9.4+
are compiled with a patch that changes the default socket directory from /tmp to
/var/run/postgresql, and 9.2 is not aware of that.


Workarounds:

* You can connect to 9.2 using /usr/pgsql-9.2/bin/psql command. It knows the
old socket directory.


That was where I was going until I saw this in the OP:

bash-4.1$ /usr/pgsql-9.2/bin/psql -p 5432
psql: could not connect to server: Connection refused
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?




* Pass -h /tmp to 9.6's psql, so that it connects to 9.2 instance.

-HTH

Regards,




--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help: Installing 9.6 breaks local connections to 9.2 on Centos 6.9

2017-05-15 Thread Devrim Gündüz

Hi,

On Mon, 2017-05-15 at 12:55 -0700, Ken Tanzer wrote:
> Hi.  On a Centos 6.9 server (in the cloud with Rackspace), I'm wanting to
> install PGDG 9.6 alongside the already-running 9.2.  After installing the
> 9.6 packages (and even before doing an initdb), I am no
> longer able to make a local connection to the 9.2 server.  Instead I get
> the message:
> 
> psql: could not connect to server: Connection refused
> Is the server running locally and accepting
> connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
> 
> That socket file does not exist on the server. (And in fact, the
> /var/run/postgresql directory didn't exist before installing 9.6).  When I
> configure 9.6 to use port 5433 and run it, it does create that socket for
> 5433.  I tried creating such a socket manually for 5432, but that didn't
> seem to change anything.
> 
> Any help in getting this working and/or pointing out what I'm missing would
> be great.  I'm also confused conceptually about what is happening here.
> What is it that the installation (but not execution) of 9.6 does that's
> blocking the local 9.2 access?  I'm guessing it's gotta be something in the
> RPM install scripts.

PGDG RPMs use alternatives method, to replace some binaries that can be used
across multiple PostgreSQL versions, and psql is one of them. When you install
9.6, 9.6's psql has higher priority than 9.2, so that one is used -- and 9.4+
are compiled with a patch that changes the default socket directory from /tmp to
/var/run/postgresql, and 9.2 is not aware of that.


Workarounds:

* You can connect to 9.2 using /usr/pgsql-9.2/bin/psql command. It knows the
old socket directory.

* Pass -h /tmp to 9.6's psql, so that it connects to 9.2 instance.

-HTH

Regards,

-- 
Devrim Gündüz
EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR




Re: [GENERAL] Help: Installing 9.6 breaks local connections to 9.2 on Centos 6.9

2017-05-15 Thread Tom Lane
Ken Tanzer  writes:
> Hi.  On a Centos 6.9 server (in the cloud with Rackspace), I'm wanting to
> install PGDG 9.6 alongside the already-running 9.2.  After installing the
> 9.6 packages (and even before doing an initdb), I am no
> longer able to make a local connection to the 9.2 server.  Instead I get
> the message:

> psql: could not connect to server: Connection refused
> Is the server running locally and accepting
> connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

Where is the 9.2 server making its socket ... /tmp ?

What it looks like is that you've started to use a libpq.so that is
following the Red Hat convention of putting the socket file in
/var/run/postgresql, rather than /tmp.  I do not know exactly where
the PGDG packages stand on that theological issue, or whether they
changed between 9.2 and 9.6.  But the first step would be to use
"ldd" to see which libpq your invoked psql is pulling in.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help: Installing 9.6 breaks local connections to 9.2 on Centos 6.9

2017-05-15 Thread Justin Pryzby
On Mon, May 15, 2017 at 12:55:48PM -0700, Ken Tanzer wrote:
> Hi.  On a Centos 6.9 server (in the cloud with Rackspace), I'm wanting to
> install PGDG 9.6 alongside the already-running 9.2.  After installing the
> 9.6 packages (and even before doing an initdb), I am no
> longer able to make a local connection to the 9.2 server.  Instead I get
> the message:

See eg.
https://www.postgresql.org/message-id/21044.1326496...@sss.pgh.pa.us
https://www.postgresql.org/message-id/0a21bc93-7b9c-476e-aaf4-0ff71708e...@elevated-dev.com

I'm guessing you upgraded the client libraries, which probably change the
(default) socket path.

Your options are to specify path to the socket (maybe in /tmp for running
PG92?), change to TCP connection, or specify server option
unix_socket_directories.

Justin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help with Trigger

2016-12-28 Thread Clifford Snow
Thank you for your suggestion which solved the problem. Much better
solution that what I was trying to accomplish. Much smaller table to query
since it only has one entry per user.

Clifford

On Wed, Dec 28, 2016 at 8:12 PM, Adrian Klaver 
wrote:

> On 12/28/2016 07:06 PM, Clifford Snow wrote:
>
>> I'm trying to write a trigger (my first) to update another table if the
>> user_id is new. But I'm getting an index exception that the user_id
>>
>
> What is the actual error message?
>
>> already exists. I'm picking up data from another feed which
>> provides me with changes to the main database.
>>
>> what I have is
>>
>> CREATE OR REPLACE FUNCTION add_new_user()
>> RETURNS TRIGGER AS
>> $BODY$
>> DECLARE
>> commits RECORD;
>> BEGIN
>> SELECT INTO commits * FROM changes WHERE user_id = NEW.user_id;
>>
>
> In the above you are checking whether the changes table has the user_id
> and if it does not, then creating a new user in the user table below. Not sure
> how they are related, but from the description of the error it would seem
> they are not that tightly coupled. In other words, just because the user_id
> does not exist in changes does not ensure it is also absent from the table
> user. Off the top of my head I would say the below might be a better query:
>
> SELECT INTO commits * FROM user WHERE user_id = NEW.user_id;
>
> Though it would help the debugging process if you showed the complete
> schema for both the changes and user tables.
>
>
> IF NOT FOUND
>> THEN
>> INSERT INTO user (user_name, user_id, change_id,
>> created_date)
>> VALUES(NEW.user_name, NEW.user_id,
>> NEW.change_id, NEW.created_date);
>> END IF;
>> RETURN NEW;
>> END;
>> $BODY$
>> LANGUAGE plpgsql;
>>
>> CREATE TRIGGER add_new_user_trigger
>> BEFORE INSERT ON changes
>> FOR EACH ROW
>> EXECUTE PROCEDURE add_new_user();
>>
>> I'm hoping for some recommendations on how to fix this or where I'm going
>> wrong.
>>
>> Thanks,
>> Clifford
>>
>>
>> --
>> @osm_seattle
>> osm_seattle.snowandsnow.us 
>> OpenStreetMap: Maps with a human touch
>>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>



-- 
@osm_seattle
osm_seattle.snowandsnow.us
OpenStreetMap: Maps with a human touch


Re: [GENERAL] Help with Trigger

2016-12-28 Thread Adrian Klaver

On 12/28/2016 07:06 PM, Clifford Snow wrote:

I'm trying to write a trigger (my first) to update another table if the
user_id is new. But I'm getting an index exception that the user_id


What is the actual error message?


already exists. I'm picking up data from another feed which
provides me with changes to the main database.

what I have is

CREATE OR REPLACE FUNCTION add_new_user()
RETURNS TRIGGER AS
$BODY$
DECLARE
commits RECORD;
BEGIN
SELECT INTO commits * FROM changes WHERE user_id = NEW.user_id;


In the above you are checking whether the changes table has the user_id 
and if it does not, then creating a new user in the user table below. Not 
sure how they are related, but from the description of the error it 
would seem they are not that tightly coupled. In other words, just 
because the user_id does not exist in changes does not ensure it is also 
absent from the table user. Off the top of my head I would say the below 
might be a better query:


SELECT INTO commits * FROM user WHERE user_id = NEW.user_id;

Though it would help the debugging process if you showed the complete 
schema for both the changes and user tables.
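
As a minimal sketch of that change, keeping the rest of your original function
(note that "user" is a reserved word, so as a table name it has to be quoted;
whether that matches your actual table layout is an assumption on my part):

CREATE OR REPLACE FUNCTION add_new_user()
RETURNS TRIGGER AS
$BODY$
DECLARE
    commits RECORD;
BEGIN
    -- look in the table we are about to insert into, not in changes
    SELECT INTO commits * FROM "user" WHERE user_id = NEW.user_id;
    IF NOT FOUND
    THEN
        INSERT INTO "user" (user_name, user_id, change_id, created_date)
        VALUES (NEW.user_name, NEW.user_id, NEW.change_id, NEW.created_date);
    END IF;
    RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql;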




IF NOT FOUND
THEN
INSERT INTO user (user_name, user_id, change_id,
created_date)
VALUES(NEW.user_name, NEW.user_id,
NEW.change_id, NEW.created_date);
END IF;
RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql;

CREATE TRIGGER add_new_user_trigger
BEFORE INSERT ON changes
FOR EACH ROW
EXECUTE PROCEDURE add_new_user();

I'm hoping for some recommendations on how to fix this or where I'm going wrong.

Thanks,
Clifford


--
@osm_seattle
osm_seattle.snowandsnow.us 
OpenStreetMap: Maps with a human touch



--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] help with moving tablespace

2016-11-17 Thread rob stone

> Bonus question: I found an ER diagram of some of the pg_* tables at
> http://www.slideshare.net/oddbjorn/Get-to-know-PostgreSQL. Is there an
> ERD of all of them so a person can better understand how to use them
> when one must? I suppose the same question applies to
> information_schema since I probably should be using that over the
> pg_* tables when possible (and as the above example shows, sometimes
> you have to go look at the pg_* tables).
> 
> Thanks!
> Kevin
> 
> 

Hello,

ExecuteQuery has an ER diagram tool. You can download the jar file from
www.executequery.org and obtain the JDBC driver from the Postgres site.
You set up separate connections to all databases that you wish to
access.
It generates the ER diagram but prior to printing it you need to drag
and drop the "boxes" around to make it readable. I have not tried it
(yet) over information_schema.

HTH,
Rob


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] help with moving tablespace

2016-11-17 Thread
> On Thu, Nov 17, 2016  wrote:
> > On Thu, Nov 17, 2016 at 9:16 AM,  wrote:
> > First, the above works only *most* of the time in our testing on multiple 
> > servers. When it fails, it's because not everything was moved out of the 
> > old tablespace and I don't understand why. An "ls $PGDATA/ourdb/PG*/" shows 
> > files are still present. According to some searching, I should be able to 
> > do:
> 
>  
> Likely more than one database in the cluster is using $PGDATA/ourdb as its 
> default tablespace location so you need to alter all of them.

Sigh, it's so easy to overlook the obvious; thanks for pointing that out. 
Knowing what to look for and with some research, doing:

select datname,dattablespace,spcname from pg_database join pg_tablespace on 
dattablespace = pg_tablespace.oid;

shows there is indeed an extra database using that tablespace that I'll need to 
drop or move. Hopefully that helps someone else.


> pg_class displays relative to the current database only so you need to log 
> into the others to check them.

Right, something else I didn't consider.


> > Second, the "ALTER DATABASE ourdb SET TABLESPACE new_ts" which does the 
> > move is slow even on our smaller test DBs, almost as if it is having to 
> > dump and reload (or more likely copy) the data. This raises the concern of 
> > how long this is going to take on our bigger DBs. Is there a faster way to 
> > accomplish the same thing especially since the new and old tablespaces are 
> > on the same disk partition?
> >
> > For example, from what I can see the data is sitting in a dir and there is 
> > a symlink to it in $PGDATA/pg_tblspc.
> >
> > Could I shut down PG, move the DB dir, recreate the symlink in pg_tblspc, 
> > then restart PG and all would be well in only a few seconds?
> 
> 
> I think this would work - all the SQL commands do is invoke O/S commands on 
> your behalf and I'm reasonably certain this is what they end up doing.  Given 
> that you are indeed testing you should try this and make sure.  It's either 
> going to work, or not, I don't foresee (in my limited experience...) any 
> delayed reaction that would be likely to arise.


Thanks! That gives me confidence to give that method a try.

Kevin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] help with moving tablespace

2016-11-17 Thread David G. Johnston
On Thu, Nov 17, 2016 at 9:16 AM,  wrote:

> First, the above works only *most* of the time in our testing on multiple
> servers. When it fails, it's because not everything was moved out of the
> old tablespace and I don't understand why. An "ls $PGDATA/ourdb/PG*/" shows
> files are still present. According to some searching, I should be able to
> do:
>

​Likely more than one database in the cluster is using $PGDATA/ourdb as its
default tablespace location so you need to alter all of them.


> SELECT c.relname, t.spcname
> FROM   pg_class c JOIN pg_tablespace t ON c.reltablespace = t.oid
> WHERE  t.spcname = 'old_name';
>
> But that always returns 0 rows. So how do I track this down?
>

​pg_class displays relative to the current database only so you need to log
into the others to check them.​


> Second, the "ALTER DATABASE ourdb SET TABLESPACE new_ts" which does the
> move is slow even on our smaller test DBs, almost as if it is having to
> dump and reload (or more likely copy) the data. This raises the concern of
> how long this is going to take on our bigger DBs. Is there a faster way to
> accomplish the same thing especially since the new and old tablespaces are
> on the same disk partition?
>
> For example, from what I can see the data is sitting in a dir and there is
> a symlink to it in $PGDATA/pg_tblspc.



> Could I shut down PG, move the DB dir, recreate the symlink in pg_tblspc,
> then restart PG and all would be well in only a few seconds?
>

​I think this would work - all the SQL commands do is invoke O/S commands
on your behalf and I'm reasonably certain this is what they end up doing.
Given that you are indeed testing you should try this and make sure.  It's
either going to work, or not; I don't foresee (in my limited experience...)
any delayed reaction that would be likely to arise.
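
Roughly, the manual version would look like the sketch below (the tablespace
OID, paths and data directory are placeholders; check the symlink in
$PGDATA/pg_tblspc for the real OID before touching anything):

# stop the cluster
pg_ctl -D /path/to/data stop

# move the tablespace directory (example paths)
mv /old/location/our_ts /new/location/our_ts

# repoint the symlink; 16385 stands in for the tablespace OID
ln -sfn /new/location/our_ts /path/to/data/pg_tblspc/16385

# start the cluster again
pg_ctl -D /path/to/data start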

​David J.​


Re: [GENERAL] Help with slow query - Pgsql 9.2

2016-09-06 Thread Jeff Janes
On Mon, Sep 5, 2016 at 6:53 PM, Patrick B  wrote:

> Hi guys,
>
> I got this query:
>
>> SELECT id,jobid,description,serialised_data
>> FROM logtable
>> WHERE log_type = 45
>> AND clientid = 24011
>> ORDER BY gtime desc
>
>

What is really going to help you here is a multicolumn index on (clientid,
log_type) or (log_type, clientid).

It will not cost you much, because you can get rid of whichever
single-column index is on the column you list first in your multi-column
index.

>
>
> Explain analyze: https://explain.depesz.com/s/XKtU
>
> So it seems the very slow part is into:
>
>   ->  Bitmap Index Scan on "ix_client"  (cost=0.00..5517.96
>> rows=367593 width=0) (actual time=2668.246..2668.246 rows=356327 loops=1)
>> Index Cond: ("clientid" = 24011)
>
>
> Am I right? The query is already using an index on that table... how could
> I improve the performance in a query that is already using an index?
>

Right, that is the slow step.  Probably the index is not already in memory
and had to be read from disk, slowly.  You could turn track_io_timing on
and then run explain (analyze, buffers) to see if that is the case.  But
once you build a multi-column index, it shouldn't really matter anymore.
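
Something along these lines, for example (untested; the table and column names
are the ones from your query, the index name is made up):

CREATE INDEX logtable_clientid_log_type_idx ON logtable (clientid, log_type);
-- after which the single-column index on clientid ("ix_client") is redundant:
DROP INDEX ix_client;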

Cheers,

Jeff


Re: [GENERAL] Help on recovering my standby

2016-06-22 Thread Patrick B
I had the same issue... A slave server had missing wal_files... and it
wasn't synced.

I had to re-sync all the DB, by running the pg_basebackup command

So.. basically, what I did is:

1 - Ensure that the wal_files are being inserted into the slave
2 - Backup the recovery.conf, postgresql.conf and pg_hba.conf
3 - Delete all the current data folder, by doing: rm -rf
/var/lib/pgsql/9.2/data/*
4 - Running the pg_basebackup command to re-sync the DB from another slave
to the slave that I wanna fix
5 - Replace the .conf backup files into the new data folder
6 - Start postgres

And it worked nice
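
For reference, step 4 was roughly the following (hostname, user and data
directory are placeholders; run it as the postgres user with the broken slave
stopped):

pg_basebackup -h healthy-slave.example.com -U replication \
    -D /var/lib/pgsql/9.2/data -X stream -P -v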

Patrick


Re: [GENERAL] Help on recovering my standby

2016-06-22 Thread Melvin Davidson
On Wed, Jun 22, 2016 at 12:22 PM, Alan Hodgson 
wrote:

> On Tuesday 21 June 2016 19:34:18 Ramalingam, Sankarakumar wrote:
> > Hi I have my standby (streaming replication) down due to missing wal
> files.
> > You would see the same error in the logs stating "cannot find the wal
> file
> > ..." What is the best way to get it going so that when we switch between
> > standby and primary once in a while they are in sync?
> >
> > Currently I am working on a CERT server and hence there is no outage
> > concerns. I need to repeat the same process on prod once I get it going
> > successfully. Any help is appreciated.
> >
>
> You should keep your WAL files from the master for at least as long as the
> slave might be offline (plus startup time), somewhere the slave can copy
> them
> from when needed (shared file system, object store, scp target, whatever).
>
> See the postgresql.conf parameter archive_command and the corresponding
> recovery.conf parameter restore_command.
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

It would be really helpful if you included PostgreSQL version and O/S in
your problem description, but since you have not, I will give a "generic"
fix.

It is doubtful, but you can check the pg_xlog on the master for the
"missing" WAL files and if they are there, simply rsync them to the standby.
If you are truly missing WAL files in your slave/standby, then you need to
rebuild the slave as per standard procedures.
Make sure you change the wal_keep_segments value on the master to be
sufficiently high so that the problem does not occur again.
Once you make the change, be sure to reload the config file on the master
Either
SELECT pg_reload_conf();
or
pg_ctl reload -D your_data_dir
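
As a rough illustration only (hostnames, paths, the segment name and the value
1024 are all placeholders, not recommendations):

# on the master: copy the segment(s) the standby complains about
rsync -av /var/lib/pgsql/9.x/data/pg_xlog/000000010000000A000000DE \
    standby.example.com:/var/lib/pgsql/9.x/data/pg_xlog/

# in postgresql.conf on the master
wal_keep_segments = 1024

# then reload, e.g.
pg_ctl reload -D /var/lib/pgsql/9.x/data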



-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Help on recovering my standby

2016-06-22 Thread Alan Hodgson
On Tuesday 21 June 2016 19:34:18 Ramalingam, Sankarakumar wrote:
> Hi I have my standby (streaming replication) down due to missing wal files.
> You would see the same error in the logs stating "cannot find the wal file
> ..." What is the best way to get it going so that when we switch between
> standby and primary once in a while they are in sync?
> 
> Currently I am working on a CERT server and hence there is no outage
> concerns. I need to repeat the same process on prod once I get it going
> successfully. Any help is appreciated.
> 

You should keep your WAL files from the master for at least as long as the 
slave might be offline (plus startup time), somewhere the slave can copy them 
from when needed (shared file system, object store, scp target, whatever).

See the postgresql.conf parameter archive_command and the corresponding  
recovery.conf parameter restore_command.
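
For illustration only (the archive path is a placeholder and must be reachable
from the standby):

# postgresql.conf on the master
archive_mode = on
archive_command = 'cp %p /mnt/wal_archive/%f'

# recovery.conf on the standby
restore_command = 'cp /mnt/wal_archive/%f %p'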



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help needed structuring Postgresql correlation query

2016-06-21 Thread Tim Smith
Thanks for that, looks like something to sink my teeth into !

On 21 June 2016 at 13:29, Alban Hertroys  wrote:
>
>> On 19 Jun 2016, at 10:58, Tim Smith  wrote:
>>
>> Hi,
>>
>> My postgresql-fu is not good enough to write a query to achieve this
>> (some may well say r is a better suited tool to achieve this !).
>>
>> I need to calculate what I would call a correlation window on a time
>> series of data, my table looks like this :
>>
>> create table data(data_date date,data_measurement numeric);
>> insert into data values('2016-01-01',16.23);
>> 
>> insert into data values('2016-06-19',30.54);
>>
>> My "target sample" would be the N most recent samples in the table
>> (e.g. 20, the most recent 20 days)
>>
>> My "potential sample" would be a moving window of size N (the same
>> size N as above), starting at T0 (i.e. 2016-01-01 in this example) and
>> incrementing by one (i.e. 2016-01-01 + 20, then 2016-01-02+20 etc),
>> but the "target sample" would obviously be excluded.
>>
>> The output needs to display window date range (or at least the start
>> date of the "potential sample" window) and the result
>> corr(target,potential).
>>
>> Hope that makes sense
>
> Something like this could do the trick (untested):
>
> with recursive sample (nr, start_date) as (
> select 1 as nr, data_date as start_date, 
> SUM(data_measurement) as total
> from generate_series(0, 19) range(step)
> left join data on (data_date = start_date + range.step)
>
> union all
>
> select nr + 1, sample.start_date +1, SUM(data_measurement) as 
> total
> from sample
> join generate_series(0, 19) range(step)
> left join data on (data_date = start_date +1 + range.step)
> where start_date +1 +19 <= (select MAX(data_date) from data)
> group by 1, 2
> )
> select * from sample where start_date >= '2016-01-01';
>
> Not sure how best to go about parameterising sample size N, a stored function 
> seems like a good option.
>
>
> Another approach would be to move a (cumulative) window-function with 20 
> items over your data set and for each row subtract the first value of the 
> previous window from the total of the current window (that is, assuming 
> you're calculating a SUM of data_measurement for each window of 20 records).
>
> Visually that looks something like this for sample size 4:
> sample 1: (A + B + C + D)
> sample 2: (A + B + C + D) + E - A = (B + C + D + E)
> sample 3: (B + C + D + E) + F - B = (C + D + E + F)
> etc.
>
> To accomplish this, you calculate two cumulative totals (often misnamed as 
> running totals, but AFAIK that's something different), one from the start, 
> and one lagging N rows behind (you can use the lag() window function for 
> that) and subtract the two.
>
> Good luck!
>
> Alban Hertroys
> --
> If you can't see the forest for the trees,
> cut the trees and you'll find there is no forest.
>


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help with namespaces in xpath (PostgreSQL 9.5.3)

2016-06-21 Thread Allan Kamau
Thank you David.

-Allan.

On Mon, Jun 20, 2016 at 11:19 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Sun, Jun 19, 2016 at 5:09 PM, Allan Kamau  wrote:
>
>> I have an xml document from which I would like to extract the contents of
>> several elements.
>>
>> I would like to use xpath to extract the contents of "name" from the xml
>> document shown below.
>>
>> WITH x AS
>> (
>> SELECT
>> '
>> http://uniprot.org/uniprot; xmlns:xsi="
>> http://www.w3.org/2001/XMLSchema-instance; xsi:schemaLocation="
>> http://uniprot.org/uniprot
>> http://www.uniprot.org/support/docs/uniprot.xsd;>
>> > version="56">
>> A0JM59
>> UBP20_XENTR
>> 
>> 
>> '::xml AS d
>> )
>> SELECT (xpath('/uniprot/entry/name/text()',a.d))[1]::text AS uniprot_name
>> FROM
>> x AS a
>> ;
>>
>> The documentation for xpath()
>> (https://www.postgresql.org/docs/9.5/static/functions-xml.html)
>> describes "xpath(xpath, xml [, nsarray])".
>>
>> For the above xml document, what would be the two dimensional array
>> "nsarray" for the xpath() function?
>>
>
> ​Is there a specific part of the description and two examples that doesn't
> make sense to you?
>
> ​Or more specifically, do you understand what namespaces are?​
>
> ARRAY[
> ARRAY['defaultns','http://uniprot.org/uniprot'],
> ARRAY['xsi','http://www.w3.org/2001/XMLSchema-instance']
> ]​
>
> In effect when the xpath function parses the XML document it tosses away
> all of the document-local namespace aliases and instead associates the full
> namespace URI with each element (in the DOM).  Since, in the xpath
> expression, usually you'd want to refer to nodes in the DOM via their
> namespace alias you need to tell the xpath function which aliases you
> intend to use in the xpath and which full URI they correspond to.
> Furthermore, there is no concept of a default namespace in the xpath
> expression.  So while you can simply copy-paste the aliases and URIs from
> all of the non-default namespace aliases you must also choose a unique
> alias for the default namespace in the original document.
>
> In the above I've copied the alias and namespace URI for the named "xsi"
> alias and gave the name "defaultns" to the original document's default
> namespace URI.
>
> David J.
>
>


Re: [GENERAL] Help needed structuring Postgresql correlation query

2016-06-21 Thread Alban Hertroys

> On 19 Jun 2016, at 10:58, Tim Smith  wrote:
> 
> Hi,
> 
> My postgresql-fu is not good enough to write a query to achieve this
> (some may well say r is a better suited tool to achieve this !).
> 
> I need to calculate what I would call a correlation window on a time
> series of data, my table looks like this :
> 
> create table data(data_date date,data_measurement numeric);
> insert into data values('2016-01-01',16.23);
> 
> insert into data values('2016-06-19',30.54);
> 
> My "target sample" would be the N most recent samples in the table
> (e.g. 20, the most recent 20 days)
> 
> My "potential sample" would be a moving window of size N (the same
> size N as above), starting at T0 (i.e. 2016-01-01 in this example) and
> incrementing by one (i.e. 2016-01-01 + 20, then 2016-01-02+20 etc),
> but the "target sample" would obviously be excluded.
> 
> The output needs to display window date range (or at least the start
> date of the "potential sample" window) and the result
> corr(target,potential).
> 
> Hope that makes sense

Something like this could do the trick (untested):

with recursive sample (nr, start_date) as (
select 1 as nr, data_date as start_date, SUM(data_measurement) 
as total
from generate_series(0, 19) range(step)
left join data on (data_date = start_date + range.step)

union all

select nr + 1, sample.start_date +1, SUM(data_measurement) as 
total
from sample
join generate_series(0, 19) range(step)
left join data on (data_date = start_date +1 + range.step)
where start_date +1 +19 <= (select MAX(data_date) from data)
group by 1, 2
)
select * from sample where start_date >= '2016-01-01';

Not sure how best to go about parameterising sample size N, a stored function 
seems like a good option.


Another approach would be to move a (cumulative) window-function with 20 items 
over your data set and for each row subtract the first value of the previous 
window from the total of the current window (that is, assuming you're 
calculating a SUM of data_measurement for each window of 20 records).

Visually that looks something like this for sample size 4:
sample 1: (A + B + C + D)
sample 2: (A + B + C + D) + E - A = (B + C + D + E)
sample 3: (B + C + D + E) + F - B = (C + D + E + F)
etc.

To accomplish this, you calculate two cumulative totals (often misnamed as 
running totals, but AFAIK that's something different), one from the start, and 
one lagging N rows behind (you can use the lag() window function for that) and 
subtract the two.
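
A rough sketch of that second approach for N = 20 (untested, assuming one row
per data_date and a plain moving SUM):

SELECT data_date,
       running - COALESCE(lag(running, 20) OVER (ORDER BY data_date), 0)
           AS window_total
FROM (
    SELECT data_date,
           SUM(data_measurement) OVER (ORDER BY data_date) AS running
    FROM data
) AS t;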

Good luck!

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help with namespaces in xpath (PostgreSQL 9.5.3)

2016-06-20 Thread David G. Johnston
On Sun, Jun 19, 2016 at 5:09 PM, Allan Kamau  wrote:

> I have an xml document from which I would like to extract the contents of
> several elements.
>
> I would like to use xpath to extract the contents of "name" from the xml
> document shown below.
>
> WITH x AS
> (
> SELECT
> '<uniprot xmlns="http://uniprot.org/uniprot" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://uniprot.org/uniprot http://www.uniprot.org/support/docs/uniprot.xsd">
> <entry version="56">
> <accession>A0JM59</accession>
> <name>UBP20_XENTR</name>
> </entry>
> </uniprot>
> '::xml AS d
> )
> SELECT (xpath('/uniprot/entry/name/text()',a.d))[1]::text AS uniprot_name
> FROM
> x AS a
> ;
>
> The documentation for xpath()
> (https://www.postgresql.org/docs/9.5/static/functions-xml.html) describes
> "xpath(xpath, xml [, nsarray])".
>
> For the above xml document, what would be the two dimensional array
> "nsarray" for the xpath() function?
>

​Is there a specific part of the description and two examples that doesn't
make sense to you?

​Or more specifically, do you understand what namespaces are?​

ARRAY[
ARRAY['defaultns','http://uniprot.org/uniprot'],
ARRAY['xsi','http://www.w3.org/2001/XMLSchema-instance']
]​

In effect when the xpath function parses the XML document it tosses away
all of the document-local namespace aliases and instead associates the full
namespace URI with each element (in the DOM).  Since, in the xpath
expression, usually you'd want to refer to nodes in the DOM via their
namespace alias you need to tell the xpath function which aliases you
intend to use in the xpath and which full URI they correspond to.
Furthermore, there is no concept of a default namespace in the xpath
expression.  So while you can simply copy-paste the aliases and URIs from
all of the non-default namespace aliases you must also choose a unique
alias for the default namespace in the original document.

In the above I've copied the alias and namespace URI for the named "xsi"
alias and gave the name "defaultns" to the original document's default
namespace URI.
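
Putting that together, the original query would become something like this
(untested):

SELECT (xpath('/defaultns:uniprot/defaultns:entry/defaultns:name/text()',
              a.d,
              ARRAY[
                ARRAY['defaultns','http://uniprot.org/uniprot'],
                ARRAY['xsi','http://www.w3.org/2001/XMLSchema-instance']
              ]))[1]::text AS uniprot_name
FROM x AS a;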

David J.


Re: [GENERAL] HELP! Uninstalled wrong version of postgres

2016-03-25 Thread Leonardo M . Ramé

El 24/03/16 a las 14:19, Howard News escribió:

Hi,

I uninstalled the wrong version of postgres on Ubuntu using apt-get
remove postgresql-9.0, convinced that this was an old unused version.
You guess the rest...

The data files still appear to be there, all 485GB of them. Can these be
restored?

Thanks.



Ok, if the data files are still there I'd do this:

1) Assuming the data is in /var/lib/postgresql/9.0, rename that 
directory to /var/lib/postgresql/9.0-old, AND COPY THAT DIRECTORY ELSEWHERE.
2) Reinstall 9.0 with "apt-get install postgresql-9.0". This should 
re-create the /var/lib/postgresql/9.0 directory with an empty "main" dir.

3) Stop 9.0 with "pg_ctlcluster 9.0 main stop".
4) Rename the new directory /var/lib/postgresql/9.0 to /var/lib/postgresql/9.0-new.
5) Rename the old dir (/var/lib/postgresql/9.0-old) to /var/lib/postgresql/9.0.
6) Restart the cluster with "pg_ctlcluster 9.0 main start".

And everything should be fine again.

P.S.: All those steps should be done as root.
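
As a rough transcript of those steps (same paths as above; double-check them on
your system before running anything):

cp -a /var/lib/postgresql/9.0 /some/safe/place/9.0-copy   # step 1, keep a copy
mv /var/lib/postgresql/9.0 /var/lib/postgresql/9.0-old
apt-get install postgresql-9.0                            # step 2
pg_ctlcluster 9.0 main stop                               # step 3
mv /var/lib/postgresql/9.0 /var/lib/postgresql/9.0-new    # step 4
mv /var/lib/postgresql/9.0-old /var/lib/postgresql/9.0    # step 5
pg_ctlcluster 9.0 main start                              # step 6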

Regards,
--
Leonardo M. Ramé
http://leonardorame.blogspot.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] HELP!!! The WAL Archive is taking up all space

2015-12-14 Thread Jim Nasby

On 12/9/15 7:05 PM, Andreas Kretschmer wrote:

I'm really newbie to PostgreSQL but the boss pushed me to handle it
>and implement it in production f*&%*$%%$#%$#

Re: [GENERAL] HELP!!! The WAL Archive is taking up all space

2015-12-09 Thread Alan Hodgson
On Wednesday, December 09, 2015 07:55:09 AM FattahRozzaq wrote:
> archive_mode = on
> archive_command = 'cp -i %p /home/postgres/archive/master/%f'
> 
> 
> The WAL archive folder is at /home/postgres/archive/master/, right?
> This directory consumes around 750GB of Disk-1.
> Each segment in the /home/postgres/archive/master/ is 16MB each
> There are currently 47443 files in this folder.
> 
> If I want to limit the total size used by the WAL archive to around 200-400
> GB, what value should I set for the wal_keep_segments,
> checkpoint_segments?

PostgreSQL doesn't clean up files copied by your archive_command. You need to 
have a separate task clean those out. PostgreSQL's active wal_keep_segments 
etc. are in the data/pg_xlog directory.




Re: [GENERAL] HELP!!! The WAL Archive is taking up all space

2015-12-09 Thread Adrian Klaver

On 12/09/2015 11:15 AM, Alan Hodgson wrote:

On Wednesday, December 09, 2015 07:55:09 AM FattahRozzaq wrote:

archive_mode = on
archive_command = 'cp -i %p /home/postgres/archive/master/%f'


The WAL archive folder is at /home/postgres/archive/master/, right?
This directory consumes around 750GB of Disk-1.
Each segment in the /home/postgres/archive/master/ is 16MB each
There are currently 47443 files in this folder.

If I want to limit the total size used by the WAL archive to around 200-400
GB, what value should I set for the wal_keep_segments,
checkpoint_segments?


PostgreSQL doesn't clean up files copied by your archive_command. You need to
have a separate task clean those out. PostgreSQL's active wal_keep_segments
etc. are in the data/pg_xlog directory.



The OP might want to take a look at:

http://www.postgresql.org/docs/9.4/interactive/pgarchivecleanup.html

To be safe I would use:

-n

Print the names of the files that would have been removed on stdout 
(performs a dry run).



at first.
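
For example (the archive path is the one from this thread; the segment name is
a placeholder for the oldest WAL file you still need, typically the one your
latest base backup starts from):

# dry run: only prints what would be removed
pg_archivecleanup -n /home/postgres/archive/master 000000010000000A000000DE

# the same command without -n actually removes the older segments
pg_archivecleanup /home/postgres/archive/master 000000010000000A000000DE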

--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] HELP!!! The WAL Archive is taking up all space

2015-12-09 Thread Joshua D. Drake

On 12/09/2015 04:38 PM, FattahRozzaq wrote:

Quick information,

After I realize, the line "archive_command=/bin/true" is a bad
decision, I have revert it back.
Now I'm really confused and panic.
I don't know what to do, and I don't really understand the postgresql.conf
I'm a network engineer, I should handle the network and also
postgresql database.
Oh man, the office is so good but this part is sucks :((


If the pg_xlog directory is growing it is likely that either:

* wal_keep_segments is set high and your slave is not correctly 
receiving updates.


* You are using a replication slot and the slave is not correctly 
receiving updates.


If your archive_command does not return a success, your pg_xlog will 
also grow but you don't need the archive_command *IF* your streaming 
replication is working *UNLESS* you are also doing archiving or PITR.


Sincerely,

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] HELP!!! The WAL Archive is taking up all space

2015-12-09 Thread FattahRozzaq
Hi John,

I really don't know why I should keep the wal archives.
I implement streaming replication into 1 server (standby server).
I'm really newbie to PostgreSQL but the boss pushed me to handle it
and implement it in production 

Re: [GENERAL] HELP!!! The WAL Archive is taking up all space

2015-12-09 Thread Adrian Klaver

On 12/09/2015 04:27 PM, FattahRozzaq wrote:

Hi John,

I really don't know why I should keep the wal archives.


So who set up the archiving and why?

Is archive recovery set up on the standby?:

http://www.postgresql.org/docs/9.4/interactive/archive-recovery-settings.html


I implement streaming replication into 1 server (standby server).


Is that the only standby or is there another set up previously?

Per another recent thread having a WAL archive to fall back on is handy 
if the streaming replication falls behind and wal_keep_segments is not 
high enough:


http://www.postgresql.org/docs/9.4/interactive/warm-standby.html#STREAMING-REPLICATION

"If you use streaming replication without file-based continuous 
archiving, the server might recycle old WAL segments before the standby 
has received them. If this occurs, the standby will need to be 
reinitialized from a new base backup. You can avoid this by setting 
wal_keep_segments to a value large enough to ensure that WAL segments 
are not recycled too early, or by configuring a replication slot for the 
standby. If you set up a WAL archive that's accessible from the standby, 
these solutions are not required, since the standby can always use the 
archive to catch up provided it retains enough segments."



I'm really newbie to PostgreSQL but the boss pushed me to handle it
and implement it in production f*&%*$%%$#%$#

Re: [GENERAL] HELP!!! The WAL Archive is taking up all space

2015-12-09 Thread Andreas Kretschmer


> FattahRozzaq  hat am 10. Dezember 2015 um 01:27
> geschrieben:
> 
> 
> Hi John,
> 
> I really don't know why I should keep the wal archives.


That's the problem! But that's your part, not ours. If you need a backup with
PITR capability you have to create a so-called base backup and continuously archive WALs.
If you later create a new base backup, say the next day, and your backup policy
is to hold one backup, then you can delete the old backup and all WALs up to the
new base backup.

If I were you I would use something like barman (see: http://www.pgbarman.org/)
for that. And yes: you should use an extra backup server. If you have both
(database and backup) on the same machine and the machine burns, you will lose
both data and backup.


Questions?



> I implement streaming replication into 1 server (standby server).


Streaming replication can't replace a backup!


> I'm really newbie to PostgreSQL but the boss pushed me to handle it
> and implement it in production 

Re: [GENERAL] HELP!!! The WAL Archive is taking up all space

2015-12-09 Thread John R Pierce

On 12/9/2015 4:27 PM, FattahRozzaq wrote:

I really don't know why I should keep the wal archives.
I implement streaming replication into 1 server (standby server).
I'm really newbie to PostgreSQL but the boss pushed me to handle it
and implement it in production f*&%*$%%$#%$#

Re: [GENERAL] HELP!!! The WAL Archive is taking up all space

2015-12-09 Thread FattahRozzaq
Hi John,

Really, thank you for spending the time typing and responding to my email.
I think the archive_command returns success; I can see the archive
directory piling up by 16MB every 2 minutes.
Maybe pg_archivecleanup is the solution to clean up the contents of the
archive folder?
How do I do it properly?
What is a pg_archivecleanup example that I can use for this case?
How do I run a dry run for pg_archivecleanup?


Best Regards,
FR

On 10/12/2015, Joshua D. Drake  wrote:
> On 12/09/2015 04:38 PM, FattahRozzaq wrote:
>> Quick information,
>>
>> After I realize, the line "archive_command=/bin/true" is a bad
>> decision, I have revert it back.
>> Now I'm really confused and panic.
>> I don't know what to do, and I don't really understand the
>> postgresql.conf
>> I'm a network engineer, I should handle the network and also
>> postgresql database.
>> Oh man, the office is so good but this part is sucks :((
>
> If the pg_xlog directory is growing it is likely that either:
>
> * wal_keep_segments is set high and your slave is not correctly
> receiving updates.
>
> * You are using a replication slot and the slave is not correctly
> receiving updates.
>
> If your archive_command does not return a success, your pg_xlog will
> also grow but you don't need the archive_command *IF* your streaming
> replication is working *UNLESS* you are also doing archiving or PITR.
>
> Sincerely,
>
> JD
>
> --
> Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
> PostgreSQL Centered full stack support, consulting and development.
> Announcing "I'm offended" is basically telling the world you can't
> control your own emotions, so everyone else should do it for you.
>


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] HELP!!! The WAL Archive is taking up all space

2015-12-09 Thread FattahRozzaq
Quick information,

After I realize, the line "archive_command=/bin/true" is a bad
decision, I have revert it back.
Now I'm really confused and panic.
I don't know what to do, and I don't really understand the postgresql.conf
I'm a network engineer, I should handle the network and also
postgresql database.
Oh man, the office is so good but this part is sucks :((

--
On 10/12/2015, FattahRozzaq  wrote:
> Hi John,
>
> I really don't know why I should keep the wal archives.
> I implement streaming replication into 1 server (standby server).
> I'm really newbie to PostgreSQL but the boss pushed me to handle it
> and implement it in production 

Re: [GENERAL] HELP!!! The WAL Archive is taking up all space

2015-12-09 Thread John R Pierce

On 12/8/2015 4:55 PM, FattahRozzaq wrote:

...I want to limit the total size used by the WAL archive to around 200-400 GB...?


for what purpose are you keeping a wal archive ?

if its for PITR (point in time recovery), you need ALL WAL records since 
the start of a base backup up to the point in time at which you wish to 
recover.



--
john r pierce, recycling bits in santa cruz



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help me recovery databases.

2015-06-01 Thread Evi-M
Thank you very much. Well done. Backups are everything :))

01.06.2015, 03:05, "Melvin Davidson" melvin6...@gmail.com:
> If you have a pg_dumpall, or a pg_dump of your databases, you "might" be able
> to get your data back by doing the following. [...]

--
www.help-tec.ru
www.хелп-тек.рф
Best regards, Генералов Юрий

Re: [GENERAL] Help me recovery databases.

2015-05-31 Thread Melvin Davidson
If you have a pg_dumpall, or a pg_dump of your databases, you might be
able to get your data back by doing the following.

1. If your data directory is corrupted or still exists, rename it.
2. Make copies of your postgresql.conf  pg_hba.conf if you still have them.
3. use initdb to recreate the data directory
4. Start PostgreSQL and create the database(s) you need
5, Restore your data from pg_dumpall or pg_dump's.
6. If step 5 works, replace the new pg_hba.conf with the old copy if you
have it.

On Sun, May 31, 2015 at 7:38 PM, Tomas Vondra tomas.von...@2ndquadrant.com
wrote:

 base is where all the data files are located, so the answer is most
 likely 'no'.

 On 05/31/15 15:11, Evi-M wrote:

 Good day, Anyone.
 I lost folders with /base
 pg_xlog and pg_clog mount another hard disk.(500gb)
 This is Postgresql 9.1, Ubuntu 12.04
 Could i restore databases without /base?
 I have archive_status folder.
 --
 Best regards, Генералов Юрий


 --
 Tomas Vondra  http://www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general




-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Help me recovery databases.

2015-05-31 Thread Tomas Vondra
base is where all the data files are located, so the answer is most 
likely 'no'.


On 05/31/15 15:11, Evi-M wrote:

Good day, Anyone.
I lost folders with /base
pg_xlog and pg_clog mount another hard disk.(500gb)
This is Postgresql 9.1, Ubuntu 12.04
Could i restore databases without /base?
I have archive_status folder.
--
Best regards, Генералов Юрий


--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help with slow table update

2015-04-19 Thread Tim Uckun
On Sat, Apr 18, 2015 at 10:24 AM, Pawel Veselov pawel.vese...@gmail.com
wrote:

 I found some dangling prepared transactions



How do you find and remove these?


Re: [GENERAL] Help with slow table update

2015-04-19 Thread Jim Nasby

On 4/19/15 9:53 PM, Tim Uckun wrote:


On Sat, Apr 18, 2015 at 10:24 AM, Pawel Veselov pawel.vese...@gmail.com
mailto:pawel.vese...@gmail.com wrote:

I found some dangling prepared transactions



How do you find and remove these?


SELECT * FROM pg_prepared_xacts;
ROLLBACK PREPARED xid;
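
Slightly expanded (note ROLLBACK PREPARED takes the gid string shown by
pg_prepared_xacts; the one-day cutoff is just an example):

-- list prepared transactions that have been hanging around
SELECT gid, prepared, owner, database
FROM pg_prepared_xacts
WHERE prepared < now() - interval '1 day';

-- then, for each offending gid from that list:
ROLLBACK PREPARED 'the_gid_from_the_list';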
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help with slow table update

2015-04-17 Thread Pawel Veselov

 [skipped]


  But remember that if you update or delete a row, removing it from an
 index, the data will stay in that index until vacuum comes along.

 Also, there's no point in doing a REINDEX after a VACUUM FULL;
 vacuum full rebuilds all the indexes for you.


 I was being desperate :)

 I still think there is something very wrong with this particular table.
 First, I have production systems that employ this function on way larger
 data set, and there is no problem (so far, but still). This machine is
 part of a test deployment, there is no constant load, the only data that
 is being written now is when I do these tests. Vacuuming should prune
 all that dead stuff, and if it's absent, it's unclear where is the time
 spent navigating/updating the table with 24 rows :)


 I think you definitely have a problem with dead rows, as evidenced by the
 huge improvement VACUUM FULL made.


 But it's not clear why (and not reasonable, IMHO, that) it wouldn't
 improve past current point.


What I should've done is 'VACUUM FULL VERBOSE'. Once I did, it told me
there were 800k dead rows that can't be removed. After digging around I
found some dangling prepared transactions, going back months. Once I threw
those away, and re-vacuumed, things got back to normal.

Thanks for all your help and advice.


Re: [GENERAL] Help with slow table update

2015-04-15 Thread Igor Neyman


From: pgsql-general-ow...@postgresql.org 
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Pawel Veselov
Sent: Tuesday, April 14, 2015 8:01 PM
To: Jim Nasby
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Help with slow table update

[skipped]

This is where using sets becomes really tedious, as Postgres severely lacks an 
upsert-like statement.
I don't think there are joins allowed in UPDATE statement, so I will need to 
use WITH query, right?
Also, I'm not sure how LEFT JOIN will help me isolate and insert missed 
entries...

Would it be OK to replace upsert part with merging into a temp table, then 
deleting and inserting from temp table? Is there any penalty for insert/delete 
comparing to update?

[skipped]

Yes, you can do UPDATE with joins 
(http://www.postgresql.org/docs/9.4/static/sql-update.html) like this:

UPDATE table1 A SET col1 = B.col2
  FROM table2 B
  WHERE A.col3 = B.col4;

Regards,
Igor Neyman


Re: [GENERAL] Help with slow table update

2015-04-15 Thread Pawel Veselov

 [skipped]



 This is where using sets becomes really tedious, as Postgres severely
 lacks an upsert-like statement.

 I don't think there are joins allowed in UPDATE statement, so I will need
 to use WITH query, right?

 Also, I'm not sure how LEFT JOIN will help me isolate and insert missed
 entries...



   [skipped]



 Yes, you can do UPDATE with joins (
 http://www.postgresql.org/docs/9.4/static/sql-update.html) like this:



 UPDATE table1 A SET col1 = B.col2

   FROM table2 B

   WHERE A.col3 = B.col4;



I meant using JOIN operator in the update. But it's still possible, though
through WITH query.


Re: [GENERAL] Help with slow table update

2015-04-14 Thread Pawel Veselov
On Mon, Apr 13, 2015 at 6:03 PM, David G. Johnston 
david.g.johns...@gmail.com wrote:

 On Mon, Apr 13, 2015 at 5:01 PM, Pawel Veselov pawel.vese...@gmail.com
 wrote:


 r_agrio_hourly - good, r_agrio_total - bad.

  Update on r_agrio_hourly  (cost=0.42..970.32 rows=250 width=329) (actual
 time=2.248..2.248 rows=0 loops=1)
    ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
  (cost=0.42..970.32 rows=250 width=329) (actual time=0.968..1.207 rows=1
 loops=1)
  Index Cond: ((tagid = 1002::numeric) AND (unitid =
 1002::numeric) AND ((rowdate)::text = '2015-04-09T23'::text) AND
 (device_type = 3::numeric) AND (placement = 2::numeric))
  Total runtime: 2.281 ms
  Update on r_agrio_total  (cost=0.42..45052.56 rows=12068 width=321)
 (actual time=106.766..106.766 rows=0 loops=1)
    ->  Index Scan using u_r_agrio_total on r_agrio_total
  (cost=0.42..45052.56 rows=12068 width=321) (actual time=0.936..32.626
 rows=1 loops=1)
  Index Cond: ((tagid = 1002::numeric) AND (unitid =
 1002::numeric) AND (device_type = 3::numeric) AND (placement = 2::numeric))
  Total runtime: 106.793 ms


 What it is you expect to see here?

 ​What are the results (count and times) for:

 SELECT count(*) FROM r_agrio_total WHERE tagid = 1002 and unitid = 1002;


Result: 8 (the whole table is 24 rows). It returns somewhat with a stumble,
but relatively quickly.
db= explain analyze SELECT count(*) FROM r_agrio_total WHERE tagid = 1002
and unitid = 1002;
   QUERY PLAN

-
 Aggregate  (cost=4.45..4.46 rows=1 width=0) (actual time=327.194..327.195
rows=1 loops=1)
   ->  Index Scan using tag_r_agrio_total on r_agrio_total
 (cost=0.42..4.45 rows=1 width=0) (actual time=0.039..327.189 rows=8
loops=1)
 Index Cond: (tagid = 1002::numeric)
 Filter: (unitid = 1002::numeric)
 Total runtime: 327.228 ms


 SELECT count(*) FROM r_agrio_hourly WHERE tagid = 1002 and unitid = 1002;


Result is 2869. Returns somewhat quickly. Explain analyze is crazy though:
db= explain analyze SELECT count(*) FROM r_agrio_hourly WHERE tagid = 1002
and unitid = 1002;

 QUERY PLAN


 Aggregate  (cost=68134.68..68134.69 rows=1 width=0) (actual
time=15177.211..15177.211 rows=1 loops=1)
   ->  Index Scan using adunit_r_agrio_hourly on r_agrio_hourly
 (cost=0.42..67027.10 rows=443035 width=0) (actual time=0.096..15175.730
rows=2869 loops=1)
 Index Cond: (unitid = 1002::numeric)
 Filter: (tagid = 1002::numeric)
 Total runtime: 15177.240 ms

​More queries along this line might be needed.  The underlying question is
 how many index rows need to be skipped over on total to get the final
 result - or rather are the columns in the index in descending order of
 cardinality?


Idea is - both tables have unique multi-field indices, and each update hits
exactly one row from that index, no more, and all fields from the index are
locked with equality condition on the update. All of the updates (within a
transaction) would always work on a small subset of rows (max a few
hundred, ever; in this case, it's may be around 10). I expect it to be
possible for the server to keep the active working set in the cache at all
times. Since the index is unique, there shouldn't be a reason to re-scan
the table, if a cached row is found, no?


 Any chance you can perform a REINDEX - maybe there is some bloat
 present?  There are queries to help discern if that may be the case, I do
 not know them off the top of my head, but just doing it might be acceptable
 and is definitely quicker if so.


That's the thing - I've done both vacuum full, and re-index. The very first
time I did vacuum full things improved (60 seconds to 7 seconds). Re-index
didn't improve anything (but it was done after vacuum full).


 ​I'm still not really following your presentation but maybe my thoughts
 will spark something.​


Thank you! I hope I clarified this some :)


Re: [GENERAL] Help with slow table update

2015-04-14 Thread Pawel Veselov
On Mon, Apr 13, 2015 at 7:37 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 4/13/15 7:01 PM, Pawel Veselov wrote:

 Cursors tend to make things slow. Avoid them if you can.


 Is there an alternative to iterating over a number of rows, where a
 direct update query is not an option?

 I really doubt that either the actual processing logic, including use of
 types has anything to do with my problem. This is based on the fact that
 out of the tables that are being changed, only one is exhibiting the
 problem. All of the involved tables have nearly the same structure, and
 have the same logical operations performed on them. I thought maybe the
 bad table is slow because it was first in the list, and Postgres was
 caching the functions results, but I moved things around, and pattern is
 the same.


 I'm guessing that you're essentially processing a queue. Take a look at
 http://www.postgresql.org/message-id/552c750f.2010...@bluetreble.com for
 some ideas. Basically, not only do cursors have non-trivial overhead, doing
 a ton of single-row queries is going to have a non-trivial overhead itself.


Thank you for the pointers. PgQ sounds interesting, it has to be remote for
RDS (I use RDS), but I'll try implementing a solution based on it.
However, of all the time being spent during this update, the breakdown (in
seconds) is:
breakdown is:

update total table: 10.773033
update hourly table: 00.179711
update daily table: 01.082467
update some other table (actually, it has cardinality similar to total
table): 00.168287
clean the queue table: 00.021733
overhead: 00.014922

The overhead is time taken to run the whole procedure, minus all these
other times that have been counted.

(some notes about the daily table below)


  As for your specific question, I suggest you modify the plpgsql
 function so that it's doing an EXPLAIN ANALYZE on the slow table.
 EXPLAIN ANALYZE actually returns a recordset the same way a SELECT
 would, with a single column of type text. So you just need to do
 something with that output. The easiest thing would be to replace
 this in your function:

 UPDATE slow_table SET ...

 to this (untested)

 RETURN QUERY EXPLAIN ANALYZE UPDATE slow_table SET ...

 and change the function so it returns SETOF text instead of whatever
 it returns now.


 Thank you, that made it a lot easier to see into what's really going on.
 But the outcome is somewhat the same. The bad table analysis shows a
 very high cost, and thousands of rows, where the table contains only 24
 rows. This time, however, the actual run time is shown, and one can see
 where the time is spent (I was using just a sum of clock_time()s around
 the update statements to see where the problem is).

 r_agrio_hourly - good, r_agrio_total - bad.

   Update on r_agrio_hourly  (cost=0.42..970.32 rows=250 width=329)
 (actual time=2.248..2.248 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
   (cost=0.42..970.32 rows=250 width=329) (actual time=0.968..1.207
 rows=1 loops=1)
   Index Cond: ((tagid = 1002::numeric) AND (unitid =
 1002::numeric) AND ((rowdate)::text = '2015-04-09T23'::text) AND
 (device_type = 3::numeric) AND (placement = 2::numeric))
   Total runtime: 2.281 ms
   Update on r_agrio_total  (cost=0.42..45052.56 rows=12068 width=321)
 (actual time=106.766..106.766 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_total on r_agrio_total
   (cost=0.42..45052.56 rows=12068 width=321) (actual time=0.936..32.626
 rows=1 loops=1)
   Index Cond: ((tagid = 1002::numeric) AND (unitid =
 1002::numeric) AND (device_type = 3::numeric) AND (placement =
 2::numeric))
   Total runtime: 106.793 ms


 Keep in mind that the estimated cost is not terribly useful; it's the
 actual times that matter.

 I suspect what's happening here is a combination of things. First, the
 hourly table is basically living in cache, but the total table is not. That
 means that when you go to find a row in the total table you're actually
 hitting the disk instead of pulling the data from memory.



 Second, you may have a lot of dead rows in the total table. I suspect this
 because of the very large amount of time the index scan is taking. Unless
 you're running on an old 10MB MFM drive you'd be pretty hard pressed for
 even 2 IO operations (one for the index leaf page and one for the heap
 page) to take 32ms. I suspect the index scan is having to read many dead
 rows in before it finds a live one, and incurring multiple IOs. Switching to
 EXPLAIN (analyze, buffers) would help confirm that.


That looks most likely to me as well. Most of the updates in a single
batch, for the total table would be on the same record, while for hourly
table it's a lot less. Logically, the tables contain identical data, except
that hourly table breaks it down per hour, and total table contains the
data for all times. The daily table contains the same data per day.

So, if I compared the tables, the total table has the 

Re: [GENERAL] Help with slow table update

2015-04-14 Thread Pawel Veselov
On Tue, Apr 14, 2015 at 3:29 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 4/14/15 4:44 PM, Pawel Veselov wrote:

 On Tue, Apr 14, 2015 at 1:15 PM, Jim Nasby jim.na...@bluetreble.com
 mailto:jim.na...@bluetreble.com wrote:

 On 4/14/15 1:28 PM, Pawel Veselov wrote:


 I wonder if what I need to do, considering that I update a lot
 of the
 same rows as I process this queue, is to create a temp table,
 update
 the rows there, and then update the actual tables once at the
 end...


 That's what I'd do.


 Well, in short, I changed (repeat the body of loop for how many tables
 are there)

 LOOP (item)
UPDATE table with item
IF not found INSERT item INTO table; END IF;
 END LOOP;

 to:

 CREATE TEMP TABLE xq_table (like table) on commit drop;
 LOOP (item)
LOOP
  UPDATE xq_table with item;
  exit when found;
  INSERT INTO xq_table select * from table for update;
  continue when found;
  INSERT item INTO xq_table;
  exit;
END LOOP;
 END LOOP;
 UPDATE table a set (rows) = (xq.rows)
FROM xq_table xq
WHERE (a.keys) = (xq.keys)

 That works significantly faster. The final update statement is very
 fast. The process is somewhat slow in the beginning as it sucks in
 records from total into xq_total, but once all of that is moved into
 the temp table, it rushes through the rest.


 Databases like to think in sets. It will generally be more efficient to do
 set operations instead of a bunch of row-by-row stuff.

 Since you're pulling all of this from some other table your best bet is
 probably something like:

 CREATE TEMP TABLE raw AS DELETE FROM queue WHERE ... RETURNING *;

 CREATE TEMP VIEW hourly_v AS SELECT ... FROM raw GROUP BY;
 UPDATE ar_hourly SET ... FROM hourly_v JOIN ...;
 INSERT INTO ar_hourly SELECT FROM hourly_v LEFT JOIN ar_hourly ON ...;

 -- Same thing for daily
 -- Same thing for total


In my previous post, there was a problem with that pseudo-code, as it's
missing inserts into the final table at the end of loop, for those records
that need to be inserted and not updated.

This is where using sets becomes really tedious, as Postgres severely lacks
an upsert-like statement.
I don't think there are joins allowed in UPDATE statement, so I will need
to use WITH query, right?
Also, I'm not sure how LEFT JOIN will help me isolate and insert missed
entries...

Would it be OK to replace upsert part with merging into a temp table, then
deleting and inserting from temp table? Is there any penalty for
insert/delete comparing to update?

[skipped]


  But remember that if you update or delete a row, removing it from an
 index, the data will stay in that index until vacuum comes along.

 Also, there's no point in doing a REINDEX after a VACUUM FULL;
 vacuum full rebuilds all the indexes for you.


 I was being desperate :)

 I still think there is something very wrong with this particular table.
 First, I have production systems that employ this function on way larger
 data set, and there is no problem (so far, but still). This machine is
 part of a test deployment, there is no constant load, the only data that
 is being written now is when I do these tests. Vacuuming should prune
 all that dead stuff, and if it's absent, it's unclear where is the time
 spent navigating/updating the table with 24 rows :)


 I think you definitely have a problem with dead rows, as evidenced by the
 huge improvement VACUUM FULL made.


But it's not clear why (and not reasonable, IMHO, that) it wouldn't improve
past current point.


Re: [GENERAL] Help with slow table update

2015-04-14 Thread Pawel Veselov
On Tue, Apr 14, 2015 at 1:15 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 4/14/15 1:28 PM, Pawel Veselov wrote:


 I wonder if what I need to do, considering that I update a lot of the
 same rows as I process this queue, is to create a temp table, update
 the rows there, and then update the actual tables once at the end...


 That's what I'd do.


Well, in short, I changed (repeat the body of loop for how many tables are
there)

LOOP (item)
  UPDATE table with item
  IF not found INSERT item INTO table; END IF;
END LOOP;

to:

CREATE TEMP TABLE xq_table (like table) on commit drop;
LOOP (item)
  LOOP
UPDATE xq_table with item;
exit when found;
INSERT INTO xq_table select * from table for update;
continue when found;
INSERT item INTO xq_table;
exit;
  END LOOP;
END LOOP;
UPDATE table a set (rows) = (xq.rows)
  FROM xq_table xq
  WHERE (a.keys) = (xq.keys)

That works significantly faster. The final update statement is very fast.
The process is somewhat slow in the beginning as it sucks in records from
total into xq_total, but once all of that is moved into the temp table,
it rushes through the rest.


 The other option would be to use a constraint trigger paired with a
 per-row trigger on the hourly table to drive the daily table, and on the
 daily table to drive the total table. The way that would work is the
 per-row table would simply keep track of all the unique records that were
 changed in a statement (presumably by putting them in a temp table). Once
 the statement is done, the constraint trigger would fire; it would
 summarize all the changed data and do a much smaller number of updates to
 the table being summarized into.


I'm not sure how I would be able to avoid the same number of changes on the
total table, trigger would fire on each update, won't it? So, same problem
with a lot of changes on a table...


 BTW, you also made a comment about not having to hit the table if you look
 at something in an index. You can only do that if all the data you need is
 in the index, AND the page with the record is marked as being all-visible
 (google for Postgres Visibility Map). If that's not the case then you still
 have to pull the row in the table in, in order to determine visibility. The
 only case where you can still avoid hitting the table is something like a
 NOT EXISTS; if you can't find any entries in the index for something then
 they definitely won't be in the table.


What I was saying is that if a table has a unique index, and there is a
cached fact that a particular index value points to a particular row, there
shouldn't be a need to re-scan the index again to search for any more
matching values (which would be necessary if the index was not unique).
Again, all considering the size of the index, the amount of different index
values that are being queried, etc.


 But remember that if you update or delete a row, removing it from an
 index, the data will stay in that index until vacuum comes along.

 Also, there's no point in doing a REINDEX after a VACUUM FULL; vacuum full
 rebuilds all the indexes for you.


I was being desperate :)

I still think there is something very wrong with this particular table.
First, I have production systems that employ this function on way larger
data set, and there is no problem (so far, but still). This machine is part
of a test deployment, there is no constant load, the only data that is
being written now is when I do these tests. Vacuuming should prune all that
dead stuff, and if it's absent, it's unclear where the time is spent
navigating/updating the table with 24 rows :)


Re: [GENERAL] Help with slow table update

2015-04-14 Thread Jim Nasby

On 4/14/15 1:28 PM, Pawel Veselov wrote:


I wonder if what I need to do, considering that I update a lot of the
same rows as I process this queue, is to create a temp table, update
the rows there, and then update the actual tables once at the end...


That's what I'd do.

The other option would be to use a constraint trigger paired with a 
per-row trigger on the hourly table to drive the daily table, and on the 
daily table to drive the total table. The way that would work is the 
per-row table would simply keep track of all the unique records that 
were changed in a statement (presumably by putting them in a temp 
table). Once the statement is done, the constraint trigger would fire; 
it would summarize all the changed data and do a much smaller number of 
updates to the table being summarized into.


BTW, you also made a comment about not having to hit the table if you 
look at something in an index. You can only do that if all the data you 
need is in the index, AND the page with the record is marked as being 
all-visible (google for Postgres Visibility Map). If that's not the case 
then you still have to pull the row in the table in, in order to 
determine visibility. The only case where you can still avoid hitting 
the table is something like a NOT EXISTS; if you can't find any entries 
in the index for something then they definitely won't be in the table. 
But remember that if you update or delete a row, removing it from an 
index, the data will stay in that index until vacuum comes along.


Also, there's no point in doing a REINDEX after a VACUUM FULL; vacuum 
full rebuilds all the indexes for you.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help with slow table update

2015-04-14 Thread Jim Nasby

On 4/14/15 4:44 PM, Pawel Veselov wrote:

On Tue, Apr 14, 2015 at 1:15 PM, Jim Nasby jim.na...@bluetreble.com
mailto:jim.na...@bluetreble.com wrote:

On 4/14/15 1:28 PM, Pawel Veselov wrote:


I wonder if what I need to do, considering that I update a lot
of the
same rows as I process this queue, is to create a temp table,
update
the rows there, and then update the actual tables once at the end...


That's what I'd do.


Well, in short, I changed (repeat the body of loop for how many tables
are there)

LOOP (item)
   UPDATE table with item
   IF not found INSERT item INTO table; END IF;
END LOOP;

to:

CREATE TEMP TABLE xq_table (like table) on commit drop;
LOOP (item)
   LOOP
 UPDATE xq_table with item;
 exit when found;
 INSERT INTO xq_table select * from table for update;
 continue when found;
 INSERT item INTO xq_table;
 exit;
   END LOOP;
END LOOP;
UPDATE table a set (rows) = (xq.rows)
   FROM xq_table xq
   WHERE (a.keys) = (xq.keys)

That works significantly faster. The final update statement is very
fast. The process is somewhat slow in the beginning as it sucks in
records from total into xq_total, but once all of that is moved into
the temp table, it rushes through the rest.


Databases like to think in sets. It will generally be more efficient to 
do set operations instead of a bunch of row-by-row stuff.


Since you're pulling all of this from some other table your best bet is 
probably something like:


CREATE TEMP TABLE raw AS DELETE FROM queue WHERE ... RETURNING *;

CREATE TEMP VIEW hourly_v AS SELECT ... FROM raw GROUP BY;
UPDATE ar_hourly SET ... FROM hourly_v JOIN ...;
INSERT INTO ar_hourly SELECT FROM hourly_v LEFT JOIN ar_hourly ON ...;

-- Same thing for daily
-- Same thing for total
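To make the first line concrete: one way to materialize the DELETE ...
RETURNING output into a temp table is a data-modifying CTE feeding an INSERT.
A sketch, assuming the queue table from this thread (xq_agr) and no batching
condition:

CREATE TEMP TABLE raw (LIKE xq_agr) ON COMMIT DROP;

WITH moved AS (
    DELETE FROM xq_agr RETURNING *
)
INSERT INTO raw SELECT * FROM moved;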


The other option would be to use a constraint trigger paired with a
per-row trigger on the hourly table to drive the daily table, and on
the daily table to drive the total table. The way that would work is
the per-row table would simply keep track of all the unique records
that were changed in a statement (presumably by putting them in a
temp table). Once the statement is done, the constraint trigger
would fire; it would summarize all the changed data and do a much
smaller number of updates to the table being summarized into.


I'm not sure how I would be able to avoid the same number of changes on
the total table, trigger would fire on each update, won't it? So, same
problem with a lot of changes on a table...


The difference is that you'd be doing plain INSERTs into a temp table 
and then summarizing that. That's going to be a LOT more efficient than 
a slew of updates on an existing table.



BTW, you also made a comment about not having to hit the table if
you look at something in an index. You can only do that if all the
data you need is in the index, AND the page with the record is
marked as being all-visible (google for Postgres Visibility Map). If
that's not the case then you still have to pull the row in the table
in, in order to determine visibility. The only case where you can
still avoid hitting the table is something like a NOT EXISTS; if you
can't find any entries in the index for something then they
definitely won't be in the table.


What I was saying is that if a table has a unique index, and there is a
cached fact that a particular index value points to a particular row,
there shouldn't be a need to re-scan the index again to search for any
more matching values (which would be necessary if the index was not
unique). Again, all considering the size of the index, the amount of
different index values that are being queried, etc.


It still has to rescan because of visibility concerns.


But remember that if you update or delete a row, removing it from an
index, the data will stay in that index until vacuum comes along.

Also, there's no point in doing a REINDEX after a VACUUM FULL;
vacuum full rebuilds all the indexes for you.


I was being desperate :)

I still think there is something very wrong with this particular table.
First, I have production systems that employ this function on way larger
data set, and there is no problem (so far, but still). This machine is
part of a test deployment, there is no constant load, the only data that
is being written now is when I do these tests. Vacuuming should prune
all that dead stuff, and if it's absent, it's unclear where is the time
spent navigating/updating the table with 24 rows :)


I think you definitely have a problem with dead rows, as evidenced by 
the huge improvement VACUUM FULL made.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help with slow table update

2015-04-13 Thread David G. Johnston
On Mon, Apr 13, 2015 at 5:01 PM, Pawel Veselov pawel.vese...@gmail.com
wrote:


 r_agrio_hourly - good, r_agrio_total - bad.

  Update on r_agrio_hourly  (cost=0.42..970.32 rows=250 width=329) (actual
 time=2.248..2.248 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
  (cost=0.42..970.32 rows=250 width=329) (actual time=0.968..1.207 rows=1
 loops=1)
  Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric)
 AND ((rowdate)::text = '2015-04-09T23'::text) AND (device_type =
 3::numeric) AND (placement = 2::numeric))
  Total runtime: 2.281 ms
  Update on r_agrio_total  (cost=0.42..45052.56 rows=12068 width=321)
 (actual time=106.766..106.766 rows=0 loops=1)
   ->  Index Scan using u_r_agrio_total on r_agrio_total
  (cost=0.42..45052.56 rows=12068 width=321) (actual time=0.936..32.626
 rows=1 loops=1)
  Index Cond: ((tagid = 1002::numeric) AND (unitid = 1002::numeric)
 AND (device_type = 3::numeric) AND (placement = 2::numeric))
  Total runtime: 106.793 ms


What it is you expect to see here?

​What are the results (count and times) for:

SELECT count(*) FROM r_agrio_total WHERE tagid = 1002 and unitid = 1002;
SELECT count(*) FROM r_agrio_hourly WHERE tagid = 1002 and unitid = 1002;

​More queries along this line might be needed.  The underlying question is
how many index rows need to be skipped over on total to get the final
result - or rather are the columns in the index in descending order of
cardinality?
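For instance, one way to check that ordering directly (the column names are
taken from the index conditions above):

SELECT count(DISTINCT tagid)       AS tagid_card,
       count(DISTINCT unitid)      AS unitid_card,
       count(DISTINCT device_type) AS device_type_card,
       count(DISTINCT placement)   AS placement_card
  FROM r_agrio_total;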

Any chance you can perform a REINDEX - maybe there is some bloat
present?  There are queries to help discern if that may be the case, I do
not know them off the top of my head, but just doing it might be acceptable
and is definitely quicker if so.
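One such check, assuming the pgstattuple contrib module is available on the
instance:

CREATE EXTENSION IF NOT EXISTS pgstattuple;
-- dead_tuple_percent and free_percent give a rough picture of table bloat
SELECT * FROM pgstattuple('r_agrio_total');
SELECT * FROM pgstatindex('u_r_agrio_total');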

​I'm still not really following your presentation but maybe my thoughts
will spark something.​

​David J.
​


Re: [GENERAL] Help with slow table update

2015-04-13 Thread Pawel Veselov
On Sun, Apr 12, 2015 at 5:40 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 4/9/15 6:18 PM, Pawel Veselov wrote:

 Hi.

 I have a plpgsql procedure that updates a few similar tables.
 for some reason, updates on one of the tables take a lot longer than the
 updates on the other ones. The difference is, say, 7 seconds vs. 80
 milliseconds.

 the procedure uses cursors and record variables to do the updates. For
 example:

  update r_agrio_total set
    unserved = unserved + (agrow->>'unserved')::numeric(38),
r_brkconn = mush_brk_conn(r_brkconn, q_item.r_brkconn),
  where
tagid = _tagid and
    unitid = (akey->>'unitid')::numeric and
    placement = (akey->>'placement')::numeric and
    device_type = (akey->>'device_type')::numeric;

 There is another table (xq_agr) that is read record by record, and for
 each of those records, such update is executed.

 I was trying to select analyze the updates to see where the time could
 be spent.
 There are only 24 rows in the bad table, and 3,400 rows in the good
 table. So, for the bad table, most of the updates will be on the same
 rows. The times were measured on processing 100 original records.

 When I'm analyzing pure update statements, I don't see anything strange.

 bad table: explain analyze update r_agrio_total set unconfirmed =
 unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0
 and device_type = 100;

 RESULT:
   Update on r_agrio_total  (cost=0.42..4.46 rows=1 width=321) (actual
 time=0.253..0.253 rows=0 loops=1)
    ->  Index Scan using tag_r_agrio_total on r_agrio_total
   (cost=0.42..4.46 rows=1 width=321) (actual time=0.037..0.041 rows=1
 loops=1)
   Index Cond: (tagid = 1000::numeric)
   Filter: ((unitid = 1000::numeric) AND (placement = 0::numeric)
 AND (device_type = 100::numeric))
   Rows Removed by Filter: 7
   Total runtime: 0.282 ms

 good table: explain analyze update r_agrio_hourly set unconfirmed =
 unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0
 and device_type = 100 and rowdate = '2015-02-23T13';

 RESULT:
   Update on r_agrio_hourly  (cost=0.42..17.36 rows=6 width=329) (actual
 time=0.102..0.102 rows=0 loops=1)
    ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
   (cost=0.42..17.36 rows=6 width=329) (actual time=0.047..0.048 rows=1
 loops=1)
   Index Cond: ((tagid = 1000::numeric) AND (unitid =
 1000::numeric) AND ((rowdate)::text = '2015-02-23T13'::text) AND
 (device_type = 100::numeric) AND (placement = 0::numeric))
   Total runtime: 0.135 ms

 When I try doing it with WITH statement (really, to apply the actual
 data that the plpgsql function uses), there is something strange in the
 bad table.

 explain analyze
 with SRC as (select * from xq_agr where id = 914830)
  update r_agrio_total set
unconfirmed = unconfirmed +
 (SRC.r_agrio->>'unconfirmed')::numeric(38)
  from SRC
  where
    tagid = (SRC.r_agrio->'key'->>'tagid')::numeric and
    unitid = (SRC.r_agrio->'key'->>'unit')::numeric and
    placement = (SRC.r_agrio->'key'->>'placement')::numeric and
    device_type = (SRC.r_agrio->'key'->>'device_type')::numeric;

 RESULT:
   Update on r_agrio_total  (cost=8.91..32777.51 rows=19331 width=409)
 (actual time=0.107..0.107 rows=0 loops=1)
 CTE src
   ->  Index Scan using xq_agr_pkey on xq_agr  (cost=0.42..8.44
 rows=1 width=379) (actual time=0.026..0.027 rows=1 loops=1)
 Index Cond: (id = 914830)
 ->  Nested Loop  (cost=0.46..32769.07 rows=19331 width=409) (actual
 time=0.107..0.107 rows=0 loops=1)
   ->  CTE Scan on src  (cost=0.00..0.02 rows=1 width=88) (actual
 time=0.032..0.033 rows=1 loops=1)
   ->  Index Scan using u_r_agrio_total on r_agrio_total
   (*cost=0.46..32285.78 rows=19331* width=321) (actual time=0.001..0.001

 rows=0 loops=1)
  Index Cond: ((tagid = (((src.r_agrio -> 'key'::text) ->>
 'tagid'::text))::numeric) AND (unitid = (((src.r_agrio -> 'key'::text)
 ->> 'unit'::text))::numeric) AND (device_type = (((src.r_agrio ->
 'key'::text) ->> 'device_type'::text))::numeric) AND (placement =
 (((src.r_agrio -> 'key'::text) ->> 'placement'::text))::numeric))
   Total runtime: 0.155 ms

 explain analyze
 with SRC as (select * from xq_agr where id = 914830)
  update r_agrio_hourly set
unconfirmed = unconfirmed +
 (SRC.r_agrio->>'unconfirmed')::numeric(38)
  from SRC
  where
    tagid = (SRC.r_agrio->'key'->>'tagid')::numeric and
    unitid = (SRC.r_agrio->'key'->>'unit')::numeric and
    placement = (SRC.r_agrio->'key'->>'placement')::numeric and
    device_type = (SRC.r_agrio->'key'->>'device_type')::numeric and
    rowdate = (SRC.r_agrio->'key'->>'rowdate');

 RESULT:
   Update on r_agrio_hourly  (cost=8.91..52.91 rows=20 width=417) (actual
 time=0.123..0.123 rows=0 loops=1)
 CTE src
   ->  Index Scan using xq_agr_pkey on xq_agr 

Re: [GENERAL] Help with slow table update

2015-04-13 Thread Jim Nasby

On 4/13/15 7:01 PM, Pawel Veselov wrote:

Cursors tend to make things slow. Avoid them if you can.


Is there an alternative to iterating over a number of rows, where a
direct update query is not an option?

I really doubt that either the actual processing logic, including use of
types has anything to do with my problem. This is based on the fact that
out of the tables that are being changed, only one is exhibiting the
problem. All of the involved tables have nearly the same structure, and
have the same logical operations performed on them. I thought maybe the
bad table is slow because it was first in the list, and Postgres was
caching the functions results, but I moved things around, and pattern is
the same.


I'm guessing that you're essentially processing a queue. Take a look at 
http://www.postgresql.org/message-id/552c750f.2010...@bluetreble.com for 
some ideas. Basically, not only do cursors have non-trivial overhead, 
doing a ton of single-row queries is going to have a non-trivial 
overhead itself.



As for your specific question, I suggest you modify the plpgsql
function so that it's doing an EXPLAIN ANALYZE on the slow table.
EXPLAIN ANALYZE actually returns a recordset the same way a SELECT
would, with a single column of type text. So you just need to do
something with that output. The easiest thing would be to replace
this in your function:

UPDATE slow_table SET ...

to this (untested)

RETURN QUERY EXPLAIN ANALYZE UPDATE slow_table SET ...

and change the function so it returns SETOF text instead of whatever
it returns now.


Thank you, that made it a lot easier to see into what's really going on.
But the outcome is somewhat the same. The bad table analysis shows a
very high cost, and thousands of rows, where the table contains only 24
rows. This time, however, the actual run time is shown, and one can see
where the time is spent (I was using just a sum of clock_time()s around
the update statements to see where the problem is).

r_agrio_hourly - good, r_agrio_total - bad.

  Update on r_agrio_hourly  (cost=0.42..970.32 rows=250 width=329)
(actual time=2.248..2.248 rows=0 loops=1)
  ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
  (cost=0.42..970.32 rows=250 width=329) (actual time=0.968..1.207
rows=1 loops=1)
  Index Cond: ((tagid = 1002::numeric) AND (unitid =
1002::numeric) AND ((rowdate)::text = '2015-04-09T23'::text) AND
(device_type = 3::numeric) AND (placement = 2::numeric))
  Total runtime: 2.281 ms
  Update on r_agrio_total  (cost=0.42..45052.56 rows=12068 width=321)
(actual time=106.766..106.766 rows=0 loops=1)
  ->  Index Scan using u_r_agrio_total on r_agrio_total
  (cost=0.42..45052.56 rows=12068 width=321) (actual time=0.936..32.626
rows=1 loops=1)
  Index Cond: ((tagid = 1002::numeric) AND (unitid =
1002::numeric) AND (device_type = 3::numeric) AND (placement = 2::numeric))
  Total runtime: 106.793 ms


Keep in mind that the estimated cost is not terribly useful; it's the 
actual times that matter.


I suspect what's happening here is a combination of things. First, the 
hourly table is basically living in cache, but the total table is not. 
That means that when you go to find a row in the total table you're 
actually hitting the disk instead of pulling the data from memory.


Second, you may have a lot of dead rows in the total table. I suspect 
this because of the very large amount of time the index scan is taking. 
Unless you're running on an old 10MB MFM drive you'd be pretty hard 
pressed for even 2 IO operations (one for the index leaf page and one 
for the heap page) to take 32ms. I suspect the index scan is having to 
read many dead rows in before it finds a live one, and incurring 
multiple IOs. Switching to EXPLAIN (analyze, buffers) would help confirm 
that.
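For example (the filter values are the ones from the plans above):

EXPLAIN (ANALYZE, BUFFERS)
UPDATE r_agrio_total SET unconfirmed = unconfirmed + 0
 WHERE tagid = 1002 AND unitid = 1002
   AND device_type = 3 AND placement = 2;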


Third, I think something odd is happening with the update itself. I'm 
pretty sure that the index scan itself is visiting the heap pages, so 
each page should be in shared buffers by the time each tuple hits the 
update node. That makes me wonder what on earth is taking 60ms to update 
the tuple. I suspect it's going into either finding a free buffer to put 
the new tuple on, or waiting to try and extend the relation. Selecting 
ctid from the freshly updated rows and comparing the first number to the 
total number of pages in the heap would show if the new tuples are all 
ending up at the end of the heap.
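A rough way to check that, using the table from this thread:

-- ctid is (page, tuple); look at the page numbers of the freshly updated rows
SELECT ctid FROM r_agrio_total
 WHERE tagid = 1002 AND unitid = 1002
   AND device_type = 3 AND placement = 2;

-- number of heap pages as of the last VACUUM/ANALYZE
SELECT relpages FROM pg_class WHERE relname = 'r_agrio_total';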

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help with slow table update

2015-04-12 Thread Jim Nasby

On 4/9/15 6:18 PM, Pawel Veselov wrote:

Hi.

I have a plpgsql procedure that updates a few similar tables.
for some reason, updates on one of the tables take a lot longer than the
updates on the other ones. The difference is, say, 7 seconds vs. 80
milliseconds.

the procedure uses cursors and record variables to do the updates. For
example:

 update r_agrio_total set
   unserved = unserved + (agrow->>'unserved')::numeric(38),
   r_brkconn = mush_brk_conn(r_brkconn, q_item.r_brkconn),
 where
   tagid = _tagid and
   unitid = (akey->>'unitid')::numeric and
   placement = (akey->>'placement')::numeric and
   device_type = (akey->>'device_type')::numeric;

There is another table (xq_agr) that is read record by record, and for
each of those records, such update is executed.

I was trying to select analyze the updates to see where the time could
be spent.
There are only 24 rows in the bad table, and 3,400 rows in the good
table. So, for the bad table, most of the updates will be on the same
rows. The times were measured on processing 100 original records.

When I'm analyzing pure update statements, I don't see anything strange.

bad table: explain analyze update r_agrio_total set unconfirmed =
unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0
and device_type = 100;

RESULT:
  Update on r_agrio_total  (cost=0.42..4.46 rows=1 width=321) (actual
time=0.253..0.253 rows=0 loops=1)
->  Index Scan using tag_r_agrio_total on r_agrio_total
  (cost=0.42..4.46 rows=1 width=321) (actual time=0.037..0.041 rows=1
loops=1)
  Index Cond: (tagid = 1000::numeric)
  Filter: ((unitid = 1000::numeric) AND (placement = 0::numeric)
AND (device_type = 100::numeric))
  Rows Removed by Filter: 7
  Total runtime: 0.282 ms

good table: explain analyze update r_agrio_hourly set unconfirmed =
unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0
and device_type = 100 and rowdate = '2015-02-23T13';

RESULT:
  Update on r_agrio_hourly  (cost=0.42..17.36 rows=6 width=329) (actual
time=0.102..0.102 rows=0 loops=1)
->  Index Scan using u_r_agrio_hourly on r_agrio_hourly
  (cost=0.42..17.36 rows=6 width=329) (actual time=0.047..0.048 rows=1
loops=1)
  Index Cond: ((tagid = 1000::numeric) AND (unitid =
1000::numeric) AND ((rowdate)::text = '2015-02-23T13'::text) AND
(device_type = 100::numeric) AND (placement = 0::numeric))
  Total runtime: 0.135 ms

When I try doing it with WITH statement (really, to apply the actual
data that the plpgsql function uses), there is something strange in the
bad table.

explain analyze
with SRC as (select * from xq_agr where id = 914830)
 update r_agrio_total set
   unconfirmed = unconfirmed +
(SRC.r_agrio->>'unconfirmed')::numeric(38)
 from SRC
 where
   tagid = (SRC.r_agrio->'key'->>'tagid')::numeric and
   unitid = (SRC.r_agrio->'key'->>'unit')::numeric and
   placement = (SRC.r_agrio->'key'->>'placement')::numeric and
   device_type = (SRC.r_agrio->'key'->>'device_type')::numeric;

RESULT:
  Update on r_agrio_total  (cost=8.91..32777.51 rows=19331 width=409)
(actual time=0.107..0.107 rows=0 loops=1)
CTE src
  ->  Index Scan using xq_agr_pkey on xq_agr  (cost=0.42..8.44
rows=1 width=379) (actual time=0.026..0.027 rows=1 loops=1)
Index Cond: (id = 914830)
->  Nested Loop  (cost=0.46..32769.07 rows=19331 width=409) (actual
time=0.107..0.107 rows=0 loops=1)
  ->  CTE Scan on src  (cost=0.00..0.02 rows=1 width=88) (actual
time=0.032..0.033 rows=1 loops=1)
  ->  Index Scan using u_r_agrio_total on r_agrio_total
  (*cost=0.46..32285.78 rows=19331* width=321) (actual time=0.001..0.001
rows=0 loops=1)
Index Cond: ((tagid = (((src.r_agrio -> 'key'::text) ->>
'tagid'::text))::numeric) AND (unitid = (((src.r_agrio -> 'key'::text)
->> 'unit'::text))::numeric) AND (device_type = (((src.r_agrio ->
'key'::text) ->> 'device_type'::text))::numeric) AND (placement =
(((src.r_agrio -> 'key'::text) ->> 'placement'::text))::numeric))
  Total runtime: 0.155 ms

explain analyze
with SRC as (select * from xq_agr where id = 914830)
 update r_agrio_hourly set
   unconfirmed = unconfirmed +
(SRC.r_agrio->>'unconfirmed')::numeric(38)
 from SRC
 where
   tagid = (SRC.r_agrio->'key'->>'tagid')::numeric and
   unitid = (SRC.r_agrio->'key'->>'unit')::numeric and
   placement = (SRC.r_agrio->'key'->>'placement')::numeric and
   device_type = (SRC.r_agrio->'key'->>'device_type')::numeric and
   rowdate = (SRC.r_agrio->'key'->>'rowdate');

RESULT:
  Update on r_agrio_hourly  (cost=8.91..52.91 rows=20 width=417) (actual
time=0.123..0.123 rows=0 loops=1)
CTE src
  ->  Index Scan using xq_agr_pkey on xq_agr  (cost=0.42..8.44
rows=1 width=379) (actual time=0.023..0.024 rows=1 loops=1)
Index Cond: (id = 914830)
->  Nested Loop  (cost=0.47..44.47 rows=20 width=417) 

Re: [GENERAL] Help with tokenization of age-ranges in full text search

2015-02-25 Thread Alvaro Herrera
Mason Hale wrote:
 Hello, I've got a 9.3 database hosted at Heroku.
 
 I'm using full text search to search for group names in part of my application,
 and some of my group names are the names of youth sports age groups like
 "Boys 9-10" or "Girls 11-12".
 
 I would like for a search for the terms "Boys", "Boys 9-10", "9", "10" or
 "9-10" to match "Boys 9-10".

Hm, so if there's a sport for "Boys 8-10", what will you do when it
doesn't match a query for "9"?  Does this matter?  I mean, maybe
tokenization is not the most appropriate thing to do in this case.

 So my question is -- can I get the tokenization that I want out of a
 configuration of the stock available token types?

The tokenizer stuff is not the most configurable part of the FTS stuff,
sadly.
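For what it's worth, ts_debug shows how the default parser splits such a
string, which is the quickest way to see what the stock token types can and
cannot give you:

SELECT alias, token, lexemes
  FROM ts_debug('english', 'Boys 9-10');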

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] help troubleshooting invalid page header error

2014-12-29 Thread Kevin Grittner
Cory Zue c...@dimagi.com wrote:

 I was able to get the database back to a normal functional state
 using the zero_damaged_pages flag. However, after getting
 everything working and starting to use the database again, I am
 again getting invalid page header errors on a certain table.

 Does this imply there is a hardware issue on my machine? Is there
 anything else that could be causing this to come back?

In my personal experience bad hardware is the most common cause,
followed by buggy device drivers (where an OS software upgrade
prevented further corruption), followed by using incorrect
procedures for backup, restore, replication setup, or node
promotion.  For example, not excluding files under pg_xlog from a
base backup or deleting (or moving) the backup_label file can cause
corruption.

For a more complete discussion, see this blog page:

http://rhaas.blogspot.com/2012/03/why-is-my-database-corrupted.html
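As an aside, in that error message 16384 is the database's OID and 76623 the
damaged relation's filenode, so running something like this inside the
affected database should name the relation:

SELECT relname, relkind FROM pg_class WHERE relfilenode = 76623;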

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] help troubleshooting invalid page header error

2014-12-28 Thread Cory Zue
Hi again,

I was able to get the database back to a normal functional state using
the zero_damaged_pages
flag. However, after getting everything working and starting to use the
database again, I am again getting invalid page header errors on a
certain table.

Does this imply there is a hardware issue on my machine? Is there anything
else that could be causing this to come back?

thanks,
Cory

On Fri, Dec 26, 2014 at 5:15 PM, Cory Zue c...@dimagi.com wrote:

 Hi Chiru,

 I am trying to pg_dump the database to have a snapshot of the current
 state. I've turned on 'zero_damaged_pages' but pg_dump is still failing
 with an invalid page header error - this time from what looks like a
 sequence object that is auto-setting IDs on a table. Any advice on how to
 remove this error?

 Here is the full query that's failing:

 SELECT sequence_name, start_value, last_value, increment_by, CASE WHEN
 increment_by > 0 AND max_value = 9223372036854775807 THEN NULL  WHEN
 increment_by < 0 AND max_value = -1 THEN NULL  ELSE max_value END AS
 max_value, CASE WHEN increment_by > 0 AND min_value = 1 THEN NULL  WHEN
 increment_by < 0 AND min_value = -9223372036854775807 THEN NULL  ELSE
 min_value END AS min_value, cache_value, is_cycled, is_called from
 unfinishedsubmissionstub_id_seq

 On Fri, Dec 26, 2014 at 2:35 PM, chiru r chir...@gmail.com wrote:

 Hi Cory,

 After recovering table turn off *zero_damaged_pages  *parameter.


 On Fri, Dec 26, 2014 at 9:13 PM, Cory Zue c...@dimagi.com wrote:

 Hi all,

 Thanks for the responses. Chiru, I'm looking into your suggestion.

 Sameer, here is the kernel version info:

 Linux dimagi 2.6.32-431.20.5.el6.x86_64 #1 SMP Wed Jul 16 05:26:53 EDT
 2014 x86_64 x86_64 x86_64 GNU/Linux

 Does that seem like it could be a problematic version?

 More generally - I'm still wondering whether I should chalk this failure
 up to a transient/random issue, or whether I should be more worried about
 the hardware on the machine. According to our diagnostic tools,  disk and
 memory are fine, but it's still not clear to me how it got into this state.
 Any general bits of information regarding the potential causes of these
 types of issues would be much appreciated.

 thanks,
 Cory


 On Fri, Dec 26, 2014 at 6:55 AM, Sameer Kumar sameer.ku...@ashnik.com
 wrote:

 On 23 Dec 2014 12:05, Cory Zue c...@dimagi.com wrote:
 
  Hi all,
 
  Our postgres instance on one of our production machines has recently
 been returning errors of the form DatabaseError: invalid page header in
 block 1 of relation base/16384/76623 from normal queries. I've been
 reading that these are often linked to hardware errors, but I would like to
 better understand what else it could be or how to determine that for sure.
 I've filled out the standard issue reporting template below. Any feedback
 or troubleshooting instructions would be much appreciated.
 
  ---
  A description of what you are trying to achieve and what results you
 expect.:
 
  Intermittent queries are failing with the error DatabaseError:
 invalid page header in block 1 of relation base/16384/76623
 
  PostgreSQL version number you are running:
 
  PostgreSQL 8.4.13 on x86_64-redhat-linux-gnu, compiled by GCC gcc
 (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit
 
  How you installed PostgreSQL:
 
  from standard package installer
 
  Changes made to the settings in the postgresql.conf file:
 
 
   name |   current_setting   |
  source
 
 --+-+--
   checkpoint_completion_target | 0.9 |
 configuration file
   checkpoint_segments  | 32  |
 configuration file
   checkpoint_timeout   | 15min   |
 configuration file
   DateStyle| ISO, MDY|
 configuration file
   default_text_search_config   | pg_catalog.english  |
 configuration file
   effective_cache_size | 1GB |
 configuration file
   lc_messages  | en_US.UTF-8 |
 configuration file
   lc_monetary  | en_US.UTF-8 |
 configuration file
   lc_numeric   | en_US.UTF-8 |
 configuration file
   lc_time  | en_US.UTF-8 |
 configuration file
   log_checkpoints  | on  |
 configuration file
   log_connections  | off |
 configuration file
   log_destination  | csvlog  |
 configuration file
   log_directory| /opt/data/pgsql/data/pg_log |
 configuration file
   log_disconnections   | off |
 configuration file
   log_duration | on  |
 configuration file
   log_filename | postgres-%Y-%m-%d_%H%M%S|
 configuration file
   

Re: [GENERAL] help troubleshooting invalid page header error

2014-12-26 Thread Sameer Kumar
On 23 Dec 2014 12:05, Cory Zue c...@dimagi.com wrote:

 Hi all,

 Our postgres instance on one of our production machines has recently been
returning errors of the form DatabaseError: invalid page header in block 1
of relation base/16384/76623 from normal queries. I've been reading that
these are often linked to hardware errors, but I would like to better
understand what else it could be or how to determine that for sure. I've
filled out the standard issue reporting template below. Any feedback or
troubleshooting instructions would be much appreciated.

 ---
 A description of what you are trying to achieve and what results you
expect.:

 Intermittent queries are failing with the error DatabaseError: invalid
page header in block 1 of relation base/16384/76623

 PostgreSQL version number you are running:

 PostgreSQL 8.4.13 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC)
4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit

 How you installed PostgreSQL:

 from standard package installer

 Changes made to the settings in the postgresql.conf file:


  name |   current_setting   |
 source

--+-+--
  checkpoint_completion_target | 0.9 |
configuration file
  checkpoint_segments  | 32  |
configuration file
  checkpoint_timeout   | 15min   |
configuration file
  DateStyle| ISO, MDY|
configuration file
  default_text_search_config   | pg_catalog.english  |
configuration file
  effective_cache_size | 1GB |
configuration file
  lc_messages  | en_US.UTF-8 |
configuration file
  lc_monetary  | en_US.UTF-8 |
configuration file
  lc_numeric   | en_US.UTF-8 |
configuration file
  lc_time  | en_US.UTF-8 |
configuration file
  log_checkpoints  | on  |
configuration file
  log_connections  | off |
configuration file
  log_destination  | csvlog  |
configuration file
  log_directory| /opt/data/pgsql/data/pg_log |
configuration file
  log_disconnections   | off |
configuration file
  log_duration | on  |
configuration file
  log_filename | postgres-%Y-%m-%d_%H%M%S|
configuration file
  log_lock_waits   | on  |
configuration file
  log_min_duration_statement   | 250ms   |
configuration file
  log_rotation_age | 1d  |
configuration file
  log_rotation_size| 1GB |
configuration file
  log_temp_files   | 0   |
configuration file
  log_timezone | Asia/Kolkata| command line
  log_truncate_on_rotation | on  |
configuration file
  logging_collector| on  |
configuration file
  maintenance_work_mem | 768MB   |
configuration file
  max_connections  | 500 |
configuration file
  max_stack_depth  | 2MB | environment
variable
  port | 5432| command line
  shared_buffers   | 4GB |
configuration file
  ssl  | on  |
configuration file
  TimeZone | Asia/Kolkata| command line
  timezone_abbreviations   | Default | command line
  wal_buffers  | 16MB|
configuration file
  work_mem | 48MB|
configuration file

 It's also probably worth noting that postgres is installed on an
encrypted volume which is mounted using ecryptfs.

 Operating system and version:

 RedHatEnterpriseServer, version 6.6

 What program you're using to connect to PostgreSQL:

 Python (django)

 Is there anything relevant or unusual in the PostgreSQL server logs?:

 I see lots of instances of this error (and similar). I'm not sure what
else I should be looking for.

 What you were doing when the error happened / how to cause the error:

 I haven't explicitly tried to reproduce it, but it seems to consistently
happen with certain queries. However, the system was rebooted shortly
before the errors started occuring. The system was rebooted because another
database (elasticsearch) was having problems on the same machine and the
reboot was to attempt to resolve things.

 The EXACT TEXT of the error message you're getting, if there is one:

 DatabaseError: invalid 

Re: [GENERAL] help troubleshooting invalid page header error

2014-12-26 Thread Cory Zue
Hi all,

Thanks for the responses. Chiru, I'm looking into your suggestion.

Sameer, here is the kernel version info:

Linux dimagi 2.6.32-431.20.5.el6.x86_64 #1 SMP Wed Jul 16 05:26:53 EDT 2014
x86_64 x86_64 x86_64 GNU/Linux

Does that seem like it could be a problematic version?

More generally - I'm still wondering whether I should chalk this failure up
to a transient/random issue, or whether I should be more worried about the
hardware on the machine. According to our diagnostic tools,  disk and
memory are fine, but it's still not clear to me how it got into this state.
Any general bits of information regarding the potential causes of these
types of issues would be much appreciated.

thanks,
Cory


On Fri, Dec 26, 2014 at 6:55 AM, Sameer Kumar sameer.ku...@ashnik.com
wrote:

 On 23 Dec 2014 12:05, Cory Zue c...@dimagi.com wrote:
 
  Hi all,
 
  Our postgres instance on one of our production machines has recently
 been returning errors of the form DatabaseError: invalid page header in
 block 1 of relation base/16384/76623 from normal queries. I've been
 reading that these are often linked to hardware errors, but I would like to
 better understand what else it could be or how to determine that for sure.
 I've filled out the standard issue reporting template below. Any feedback
 or troubleshooting instructions would be much appreciated.
 
  ---
  A description of what you are trying to achieve and what results you
 expect.:
 
  Intermittent queries are failing with the error DatabaseError: invalid
 page header in block 1 of relation base/16384/76623
 
  PostgreSQL version number you are running:
 
  PostgreSQL 8.4.13 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC)
 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit
 
  How you installed PostgreSQL:
 
  from standard package installer
 
  Changes made to the settings in the postgresql.conf file:
 
 
   name |   current_setting   |
  source
 
 --+-+--
   checkpoint_completion_target | 0.9 |
 configuration file
   checkpoint_segments  | 32  |
 configuration file
   checkpoint_timeout   | 15min   |
 configuration file
   DateStyle| ISO, MDY|
 configuration file
   default_text_search_config   | pg_catalog.english  |
 configuration file
   effective_cache_size | 1GB |
 configuration file
   lc_messages  | en_US.UTF-8 |
 configuration file
   lc_monetary  | en_US.UTF-8 |
 configuration file
   lc_numeric   | en_US.UTF-8 |
 configuration file
   lc_time  | en_US.UTF-8 |
 configuration file
   log_checkpoints  | on  |
 configuration file
   log_connections  | off |
 configuration file
   log_destination  | csvlog  |
 configuration file
   log_directory| /opt/data/pgsql/data/pg_log |
 configuration file
   log_disconnections   | off |
 configuration file
   log_duration | on  |
 configuration file
   log_filename | postgres-%Y-%m-%d_%H%M%S|
 configuration file
   log_lock_waits   | on  |
 configuration file
   log_min_duration_statement   | 250ms   |
 configuration file
   log_rotation_age | 1d  |
 configuration file
   log_rotation_size| 1GB |
 configuration file
   log_temp_files   | 0   |
 configuration file
   log_timezone | Asia/Kolkata| command
 line
   log_truncate_on_rotation | on  |
 configuration file
   logging_collector| on  |
 configuration file
   maintenance_work_mem | 768MB   |
 configuration file
   max_connections  | 500 |
 configuration file
   max_stack_depth  | 2MB |
 environment variable
   port | 5432| command
 line
   shared_buffers   | 4GB |
 configuration file
   ssl  | on  |
 configuration file
   TimeZone | Asia/Kolkata| command
 line
   timezone_abbreviations   | Default | command
 line
   wal_buffers  | 16MB|
 configuration file
   work_mem | 48MB|
 configuration file
 
  It's also probably worth noting that 

Re: [GENERAL] help troubleshooting invalid page header error

2014-12-26 Thread chiru r
Hi Cory,

After recovering the table, turn off the *zero_damaged_pages* parameter.


On Fri, Dec 26, 2014 at 9:13 PM, Cory Zue c...@dimagi.com wrote:

 Hi all,

 Thanks for the responses. Chiru, I'm looking into your suggestion.

 Sameer, here is the kernel version info:

 Linux dimagi 2.6.32-431.20.5.el6.x86_64 #1 SMP Wed Jul 16 05:26:53 EDT
 2014 x86_64 x86_64 x86_64 GNU/Linux

 Does that seem like it could be a problematic version?

 More generally - I'm still wondering whether I should chalk this failure
 up to a transient/random issue, or whether I should be more worried about
 the hardware on the machine. According to our diagnostic tools,  disk and
 memory are fine, but it's still not clear to me how it got into this state.
 Any general bits of information regarding the potential causes of these
 types of issues would be much appreciated.

 thanks,
 Cory


 On Fri, Dec 26, 2014 at 6:55 AM, Sameer Kumar sameer.ku...@ashnik.com
 wrote:

 On 23 Dec 2014 12:05, Cory Zue c...@dimagi.com wrote:
 
  Hi all,
 
  Our postgres instance on one of our production machines has recently
 been returning errors of the form DatabaseError: invalid page header in
 block 1 of relation base/16384/76623 from normal queries. I've been
 reading that these are often linked to hardware errors, but I would like to
 better understand what else it could be or how to determine that for sure.
 I've filled out the standard issue reporting template below. Any feedback
 or troubleshooting instructions would be much appreciated.
 
  ---
  A description of what you are trying to achieve and what results you
 expect.:
 
  Intermittent queries are failing with the error DatabaseError: invalid
 page header in block 1 of relation base/16384/76623
 
  PostgreSQL version number you are running:
 
  PostgreSQL 8.4.13 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC)
 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit
 
  How you installed PostgreSQL:
 
  from standard package installer
 
  Changes made to the settings in the postgresql.conf file:
 
 
   name |   current_setting   |
  source
 
 --+-+--
   checkpoint_completion_target | 0.9 |
 configuration file
   checkpoint_segments  | 32  |
 configuration file
   checkpoint_timeout   | 15min   |
 configuration file
   DateStyle| ISO, MDY|
 configuration file
   default_text_search_config   | pg_catalog.english  |
 configuration file
   effective_cache_size | 1GB |
 configuration file
   lc_messages  | en_US.UTF-8 |
 configuration file
   lc_monetary  | en_US.UTF-8 |
 configuration file
   lc_numeric   | en_US.UTF-8 |
 configuration file
   lc_time  | en_US.UTF-8 |
 configuration file
   log_checkpoints  | on  |
 configuration file
   log_connections  | off |
 configuration file
   log_destination  | csvlog  |
 configuration file
   log_directory| /opt/data/pgsql/data/pg_log |
 configuration file
   log_disconnections   | off |
 configuration file
   log_duration | on  |
 configuration file
   log_filename | postgres-%Y-%m-%d_%H%M%S|
 configuration file
   log_lock_waits   | on  |
 configuration file
   log_min_duration_statement   | 250ms   |
 configuration file
   log_rotation_age | 1d  |
 configuration file
   log_rotation_size| 1GB |
 configuration file
   log_temp_files   | 0   |
 configuration file
   log_timezone | Asia/Kolkata| command
 line
   log_truncate_on_rotation | on  |
 configuration file
   logging_collector| on  |
 configuration file
   maintenance_work_mem | 768MB   |
 configuration file
   max_connections  | 500 |
 configuration file
   max_stack_depth  | 2MB |
 environment variable
   port | 5432| command
 line
   shared_buffers   | 4GB |
 configuration file
   ssl  | on  |
 configuration file
   TimeZone | Asia/Kolkata| command
 line
   timezone_abbreviations   | Default | command
 line
   wal_buffers  | 16MB   

Re: [GENERAL] help troubleshooting invalid page header error

2014-12-26 Thread Cory Zue
Hi Chiru,

I am trying to pg_dump the database to have a snapshot of the current
state. I've turned on 'zero_damaged_pages' but pg_dump is still failing
with an invalid page header error - this time from what looks like a
sequence object that is auto-setting IDs on a table. Any advice on how to
remove this error?

Here is the full query that's failing:

SELECT sequence_name, start_value, last_value, increment_by, CASE WHEN
increment_by > 0 AND max_value = 9223372036854775807 THEN NULL  WHEN
increment_by < 0 AND max_value = -1 THEN NULL  ELSE max_value END AS
max_value, CASE WHEN increment_by > 0 AND min_value = 1 THEN NULL  WHEN
increment_by < 0 AND min_value = -9223372036854775807 THEN NULL  ELSE
min_value END AS min_value, cache_value, is_cycled, is_called from
unfinishedsubmissionstub_id_seq

On Fri, Dec 26, 2014 at 2:35 PM, chiru r chir...@gmail.com wrote:

 Hi Cory,

 After recovering the table, turn off the *zero_damaged_pages* parameter.


 On Fri, Dec 26, 2014 at 9:13 PM, Cory Zue c...@dimagi.com wrote:

 Hi all,

 Thanks for the responses. Chiru, I'm looking into your suggestion.

 Sameer, here is the kernel version info:

 Linux dimagi 2.6.32-431.20.5.el6.x86_64 #1 SMP Wed Jul 16 05:26:53 EDT
 2014 x86_64 x86_64 x86_64 GNU/Linux

 Does that seem like it could be a problematic version?

 More generally - I'm still wondering whether I should chalk this failure
 up to a transient/random issue, or whether I should be more worried about
 the hardware on the machine. According to our diagnostic tools,  disk and
 memory are fine, but it's still not clear to me how it got into this state.
 Any general bits of information regarding the potential causes of these
 types of issues would be much appreciated.

 thanks,
 Cory


 On Fri, Dec 26, 2014 at 6:55 AM, Sameer Kumar sameer.ku...@ashnik.com
 wrote:

 On 23 Dec 2014 12:05, Cory Zue c...@dimagi.com wrote:
 
  Hi all,
 
  Our postgres instance on one of our production machines has recently
 been returning errors of the form DatabaseError: invalid page header in
 block 1 of relation base/16384/76623 from normal queries. I've been
 reading that these are often linked to hardware errors, but I would like to
 better understand what else it could be or how to determine that for sure.
 I've filled out the standard issue reporting template below. Any feedback
 or troubleshooting instructions would be much appreciated.
 
  ---
  A description of what you are trying to achieve and what results you
 expect.:
 
  Intermittent queries are failing with the error DatabaseError:
 invalid page header in block 1 of relation base/16384/76623
 
  PostgreSQL version number you are running:
 
  PostgreSQL 8.4.13 on x86_64-redhat-linux-gnu, compiled by GCC gcc
 (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit
 
  How you installed PostgreSQL:
 
  from standard package installer
 
  Changes made to the settings in the postgresql.conf file:
 
 
   name |   current_setting   |
  source
 
 --+-+--
   checkpoint_completion_target | 0.9 |
 configuration file
   checkpoint_segments  | 32  |
 configuration file
   checkpoint_timeout   | 15min   |
 configuration file
   DateStyle| ISO, MDY|
 configuration file
   default_text_search_config   | pg_catalog.english  |
 configuration file
   effective_cache_size | 1GB |
 configuration file
   lc_messages  | en_US.UTF-8 |
 configuration file
   lc_monetary  | en_US.UTF-8 |
 configuration file
   lc_numeric   | en_US.UTF-8 |
 configuration file
   lc_time  | en_US.UTF-8 |
 configuration file
   log_checkpoints  | on  |
 configuration file
   log_connections  | off |
 configuration file
   log_destination  | csvlog  |
 configuration file
   log_directory| /opt/data/pgsql/data/pg_log |
 configuration file
   log_disconnections   | off |
 configuration file
   log_duration | on  |
 configuration file
   log_filename | postgres-%Y-%m-%d_%H%M%S|
 configuration file
   log_lock_waits   | on  |
 configuration file
   log_min_duration_statement   | 250ms   |
 configuration file
   log_rotation_age | 1d  |
 configuration file
   log_rotation_size| 1GB |
 configuration file
   log_temp_files   | 0   |
 configuration file
   log_timezone | Asia/Kolkata| 

Re: [GENERAL] help troubleshooting invalid page header error

2014-12-26 Thread Cory Zue
(nevermind - it looks like the zero_damaged_pages setting only took for the
duration of the session)
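
(A minimal sketch, not from the original message, assuming a superuser session:
SET only affects the backend it is issued in, so the separate connection that
pg_dump opens still starts with the server default.)

SET zero_damaged_pages = on;   -- affects only the current session
SHOW zero_damaged_pages;       -- reports "on" here
-- a new connection (for example, the one pg_dump opens) still reports "off"
-- unless the parameter is changed in postgresql.conf and the server reloaded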

On Fri, Dec 26, 2014 at 5:15 PM, Cory Zue c...@dimagi.com wrote:

 Hi Chiru,

 I am trying to pg_dump the database to have a snapshot of the current
 state. I've turned on 'zero_damaged_pages' but pg_dump is still failing
 with an invalid page header error - this time from what looks like a
 sequence object that is auto-setting IDs on a table. Any advice on how to
 remove this error?

 Here is the full query that's failing:

 SELECT sequence_name, start_value, last_value, increment_by, CASE WHEN
 increment_by > 0 AND max_value = 9223372036854775807 THEN NULL  WHEN
 increment_by < 0 AND max_value = -1 THEN NULL  ELSE max_value END AS
 max_value, CASE WHEN increment_by > 0 AND min_value = 1 THEN NULL  WHEN
 increment_by < 0 AND min_value = -9223372036854775807 THEN NULL  ELSE
 min_value END AS min_value, cache_value, is_cycled, is_called from
 unfinishedsubmissionstub_id_seq

 On Fri, Dec 26, 2014 at 2:35 PM, chiru r chir...@gmail.com wrote:

 Hi Cory,

 After recovering the table, turn off the *zero_damaged_pages* parameter.


 On Fri, Dec 26, 2014 at 9:13 PM, Cory Zue c...@dimagi.com wrote:

 Hi all,

 Thanks for the responses. Chiru, I'm looking into your suggestion.

 Sameer, here is the kernel version info:

 Linux dimagi 2.6.32-431.20.5.el6.x86_64 #1 SMP Wed Jul 16 05:26:53 EDT
 2014 x86_64 x86_64 x86_64 GNU/Linux

 Does that seem like it could be a problematic version?

 More generally - I'm still wondering whether I should chalk this failure
 up to a transient/random issue, or whether I should be more worried about
 the hardware on the machine. According to our diagnostic tools,  disk and
 memory are fine, but it's still not clear to me how it got into this state.
 Any general bits of information regarding the potential causes of these
 types of issues would be much appreciated.

 thanks,
 Cory


 On Fri, Dec 26, 2014 at 6:55 AM, Sameer Kumar sameer.ku...@ashnik.com
 wrote:

 On 23 Dec 2014 12:05, Cory Zue c...@dimagi.com wrote:
 
  Hi all,
 
  Our postgres instance on one of our production machines has recently
 been returning errors of the form DatabaseError: invalid page header in
 block 1 of relation base/16384/76623 from normal queries. I've been
 reading that these are often linked to hardware errors, but I would like to
 better understand what else it could be or how to determine that for sure.
 I've filled out the standard issue reporting template below. Any feedback
 or troubleshooting instructions would be much appreciated.
 
  ---
  A description of what you are trying to achieve and what results you
 expect.:
 
  Intermittent queries are failing with the error DatabaseError:
 invalid page header in block 1 of relation base/16384/76623
 
  PostgreSQL version number you are running:
 
  PostgreSQL 8.4.13 on x86_64-redhat-linux-gnu, compiled by GCC gcc
 (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit
 
  How you installed PostgreSQL:
 
  from standard package installer
 
  Changes made to the settings in the postgresql.conf file:
 
 
   name |   current_setting   |
  source
 
 --+-+--
   checkpoint_completion_target | 0.9 |
 configuration file
   checkpoint_segments  | 32  |
 configuration file
   checkpoint_timeout   | 15min   |
 configuration file
   DateStyle| ISO, MDY|
 configuration file
   default_text_search_config   | pg_catalog.english  |
 configuration file
   effective_cache_size | 1GB |
 configuration file
   lc_messages  | en_US.UTF-8 |
 configuration file
   lc_monetary  | en_US.UTF-8 |
 configuration file
   lc_numeric   | en_US.UTF-8 |
 configuration file
   lc_time  | en_US.UTF-8 |
 configuration file
   log_checkpoints  | on  |
 configuration file
   log_connections  | off |
 configuration file
   log_destination  | csvlog  |
 configuration file
   log_directory| /opt/data/pgsql/data/pg_log |
 configuration file
   log_disconnections   | off |
 configuration file
   log_duration | on  |
 configuration file
   log_filename | postgres-%Y-%m-%d_%H%M%S|
 configuration file
   log_lock_waits   | on  |
 configuration file
   log_min_duration_statement   | 250ms   |
 configuration file
   log_rotation_age | 1d  |
 configuration file
   log_rotation_size| 1GB  

Re: [GENERAL] help troubleshooting invalid page header error

2014-12-25 Thread chiru r
Hi Cory,

We have a *zero_damaged_pages* parameter in the PostgreSQL configuration; by
default it is set to *off*.
To recover data from a corrupted table, we can turn this parameter *on* as a
superuser and populate a new table using the dump or copy utility.

Note: the damaged pages themselves cannot be recovered; they are zeroed out
and skipped while fetching data from the table.

Please follow the steps below if you decide to recover data from a corrupted table.

*Sample case :*

[postgres@instructor ~]$ /usr/local/pgsql/bin/psql
psql (9.4rc1)
Type "help" for help.

postgres=# select count(*) from test;
*ERROR:  invalid page in block 7 of relation base/13003/16384*
postgres=# show zero_damaged_pages;
 zero_damaged_pages
--------------------
 off
(1 row)

postgres=# *set zero_damaged_pages=on;*
SET
postgres=# show zero_damaged_pages;
 zero_damaged_pages
--------------------
 on
(1 row)

postgres=# select count(*) from test;
*WARNING:  invalid page in block 7 of relation base/13003/16384; zeroing
out page*
WARNING:  invalid page in block 8 of relation base/13003/16384; zeroing out
page
WARNING:  invalid page in block 9 of relation base/13003/16384; zeroing out
page
WARNING:  invalid page in block 10 of relation base/13003/16384; zeroing
out page
WARNING:  invalid page in block 11 of relation base/13003/16384; zeroing
out page
WARNING:  invalid page in block 12 of relation base/13003/16384; zeroing
out page
WARNING:  invalid page in block 13 of relation base/13003/16384; zeroing
out page

 count
--------
 979163
(1 row)
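
(A sketch of the "populate a new table" step mentioned earlier, reusing the
test table from this example; test_salvaged is just an illustrative name, and
rows that lived on the zeroed pages are gone for good.)

SET zero_damaged_pages = on;
CREATE TABLE test_salvaged AS SELECT * FROM test;  -- damaged pages are zeroed and skipped
SET zero_damaged_pages = off;
-- test_salvaged now holds whatever rows survived and can be dumped normally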


On Tue, Dec 23, 2014 at 8:47 AM, Cory Zue c...@dimagi.com wrote:

 Hi all,

 Our postgres instance on one of our production machines has recently been
 returning errors of the form DatabaseError: invalid page header in block
 1 of relation base/16384/76623 from normal queries. I've been reading that
 these are often linked to hardware errors, but I would like to better
 understand what else it could be or how to determine that for sure. I've
 filled out the standard issue reporting template below. Any feedback or
 troubleshooting instructions would be much appreciated.

 ---
 A description of what you are trying to achieve and what results you
 expect.:

 Intermittent queries are failing with the error DatabaseError: invalid
 page header in block 1 of relation base/16384/76623

 PostgreSQL version number you are running:

 PostgreSQL 8.4.13 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC)
 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit

 How you installed PostgreSQL:

 from standard package installer

 Changes made to the settings in the postgresql.conf file:


  name |   current_setting   |
  source

 --+-+--
  checkpoint_completion_target | 0.9 |
 configuration file
  checkpoint_segments  | 32  |
 configuration file
  checkpoint_timeout   | 15min   |
 configuration file
  DateStyle| ISO, MDY|
 configuration file
  default_text_search_config   | pg_catalog.english  |
 configuration file
  effective_cache_size | 1GB |
 configuration file
  lc_messages  | en_US.UTF-8 |
 configuration file
  lc_monetary  | en_US.UTF-8 |
 configuration file
  lc_numeric   | en_US.UTF-8 |
 configuration file
  lc_time  | en_US.UTF-8 |
 configuration file
  log_checkpoints  | on  |
 configuration file
  log_connections  | off |
 configuration file
  log_destination  | csvlog  |
 configuration file
  log_directory| /opt/data/pgsql/data/pg_log |
 configuration file
  log_disconnections   | off |
 configuration file
  log_duration | on  |
 configuration file
  log_filename | postgres-%Y-%m-%d_%H%M%S|
 configuration file
  log_lock_waits   | on  |
 configuration file
  log_min_duration_statement   | 250ms   |
 configuration file
  log_rotation_age | 1d  |
 configuration file
  log_rotation_size| 1GB |
 configuration file
  log_temp_files   | 0   |
 configuration file
  log_timezone | Asia/Kolkata| command line
  log_truncate_on_rotation | on  |
 configuration file
  logging_collector| on  |
 configuration file
  maintenance_work_mem | 768MB   |
 configuration file
  max_connections  | 500 |
 configuration file
  

Re: [GENERAL] Help Optimizing a Summary Query

2014-12-11 Thread Robert DiFalco
Thanks Arthur. I don't think there is as big a difference between BIGINT and
INTEGER as you think there is. In fact, with an extended filesystem you
might not see any difference at all.

As I put in the first email, I am using a GIST index on user.name.

I was really more interested in the LEFT OUTER JOINs vs EXISTS queries and
if there was a better alternative I had not considered.
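
(A sketch, not from the thread: the most direct way to compare the EXISTS and
LEFT OUTER JOIN formulations is to run each under EXPLAIN (ANALYZE, BUFFERS)
against production-sized data, e.g. for the EXISTS form:)

EXPLAIN (ANALYZE, BUFFERS)
SELECT u.id, u.name
  FROM users u
 WHERE u.id != 33
   AND u.name LIKE 'John%'
   AND EXISTS (SELECT 1
                 FROM friends f
                WHERE f.user_id = 33
                  AND f.friend_id = u.id);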

On Tue, Dec 9, 2014 at 11:44 AM, Arthur Silva arthur...@gmail.com wrote:

 On Tue, Dec 9, 2014 at 4:18 PM, Robert DiFalco robert.difa...@gmail.com
 wrote:

 I'm sorry, I missed a JOIN on the second variation. It is:

 SELECT u.id, u.name, u.imageURL, u.bio,
CASE
   WHEN f.friend_id IS NOT NULL THEN 'isFriend'
   WHEN s.to_id IS NOT NULL THEN 'hasSentRequest'
   WHEN r.to_id IS NOT NULL THEN 'hasReceivedRequest'
   ELSE 'none'
END AS 'friendStatus',
(SELECT COUNT(1) AS d
   FROM friends f1
  JOIN friends f2 ON f1.fiend_id = f2.friend_id
   WHERE f1.user_id = 33 AND f2.user_id = u.id)
 FROM users u
 *LEFT OUTER JOIN friends f ON f.user_id = 33 AND f.friend_id = u.id*
 LEFT OUTER JOIN friend_requests s ON s.to_id = 33 AND s.from_id = u.id
 LEFT OUTER JOIN friend_requests r ON r.to_id = u.id AND r.from_id = 33
 WHERE u.id != 33 AND u.name LIKE '%John%' ORDER BY u.name;


 On Tue, Dec 9, 2014 at 10:15 AM, Robert DiFalco robert.difa...@gmail.com
  wrote:

 I have users, friends, and friend_requests. I need a query that
 essentially returns a summary containing:

 * user (name, imageURL, bio, ...)
 * Friend status (relative to an active user)
* Is the user a friend of the active user?
* Has the user sent a friend request to the active user?
* Has the user received a friend request from the active user?
 * # of mutualFriends
 * Exclude the active user from the result set.

 So I have mocked this up two ways but both have complicated query plans
 that will be problematic with large data sets. I'm thinking that my lack of
 deep SQL knowledge is making me miss the obvious choice.

 Here's my two query examples:

 SELECT u.id, u.name, u.imageURL, u.bio,
CASE
   WHEN EXISTS(SELECT 1 FROM friends f WHERE f.user_id = 33 AND
 f.friend_id = u.id)   THEN 'isFriend'
   WHEN EXISTS(SELECT 1 FROM friend_requests s WHERE s.to_id = 33
 AND s.from_id = u.id) THEN 'hasSentRequest'
   WHEN EXISTS(SELECT 1 FROM friend_requests r WHERE r.to_id = u.id
 AND r.from_id = 33)   THEN 'hasReceivedRequest'
   ELSE 'none'
END AS friendStatus,
(SELECT COUNT(1)
   FROM friends f1
  JOIN friends f2 ON f1.friend_id = f2.friend_id
   WHERE f1.user_id = 33 AND f2.user_id = u.id) AS mutualFriends
 FROM users u
 WHERE u.id != 33 AND u.name LIKE 'John%' ORDER BY u.name;

 SELECT u.id, u.name, u.imageURL, u.bio,
CASE
   WHEN f.friend_id IS NOT NULL THEN 'isFriend'
   WHEN s.to_id IS NOT NULL THEN 'hasSentRequest'
   WHEN r.to_id IS NOT NULL THEN 'hasReceivedRequest'
   ELSE 'none'
END AS 'friendStatus',
(SELECT COUNT(1) AS d
   FROM friends f1
  JOIN friends f2 ON f1.fiend_id = f2.friend_id
   WHERE f1.user_id = 33 AND f2.user_id = u.id)
 FROM users u
 LEFT OUTER JOIN friend_requests s ON s.to_id = 33 AND s.from_id = u.id
 LEFT OUTER JOIN friend_requests r ON r.to_id = u.id AND r.from_id = 33
 WHERE u.id != 33 AND u.name LIKE 'John%' ORDER BY u.name;

 33 is just the id of the active user I am using for testing. The WHERE
 clause could be anything. I'm just using u.name here but I'm more
 concerned about the construction of the result set than the WHERE clause.
 These have more or less similar query plans, nothing that would change
 things factorially. Is this the best I can do or am I missing the obvious?

 Here are the tables:


 CREATE TABLE users (
   idBIGINT,
   name  VARCHAR,
   imageURL  VARCHAR
   created   TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
   phone_natlBIGINT,   /* National Phone Number */
   country_e164  SMALLINT, /* E164 country code */
   email VARCHAR(255),
   PRIMARY KEY (id),
   UNIQUE (email),
   UNIQUE (phone_natl, country_e164)
 );


 CREATE TABLE friends (
   user_id  BIGINT,
   friend_id   BIGINT,
   PRIMARY KEY (user_id, user_id),
   FOREIGN KEY (user_id)REFERENCES users(id) ON DELETE CASCADE,
   FOREIGN KEY (friend_id)  REFERENCES users(id) ON DELETE CASCADE
 );
 CREATE INDEX idx_friends_friend ON friends(friend_id);

 CREATE TABLE friend_requests (
   from_id  BIGINT,
   to_idBIGINT,
   created  TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
   PRIMARY KEY (from_id, user_id),
   FOREIGN KEY (from_id)  REFERENCES users(id) ON DELETE CASCADE,
   FOREIGN KEY (to_id)REFERENCES users(id) ON DELETE CASCADE
 );
 CREATE INDEX idx_friend_requests_to ON friend_requests(to_id);

 Let me know if you guys need anything else.




 Hello Robert, none of your schemas worked for me, here's a clean version

 CREATE TABLE users (
 

Re: [GENERAL] Help Optimizing a Summary Query

2014-12-11 Thread Arthur Silva
On Thu, Dec 11, 2014 at 6:52 PM, Robert DiFalco robert.difa...@gmail.com
wrote:

 Thanks Arthur. I don't think there is as big a difference between BIGINT
 and INTEGER as you think there is. In fact, with an extended filesystem you
 might not see any difference at all.

 As I put in the first email, I am using a GIST index on user.name.

 I was really more interested in the LEFT OUTER JOINs vs EXISTS queries and
 if there was a better alternative I had not considered.

 On Tue, Dec 9, 2014 at 11:44 AM, Arthur Silva arthur...@gmail.com wrote:

 On Tue, Dec 9, 2014 at 4:18 PM, Robert DiFalco robert.difa...@gmail.com
 wrote:

 I'm sorry, I missed a JOIN on the second variation. It is:

 SELECT u.id, u.name, u.imageURL, u.bio,
CASE
   WHEN f.friend_id IS NOT NULL THEN 'isFriend'
   WHEN s.to_id IS NOT NULL THEN 'hasSentRequest'
   WHEN r.to_id IS NOT NULL THEN 'hasReceivedRequest'
   ELSE 'none'
END AS 'friendStatus',
(SELECT COUNT(1) AS d
   FROM friends f1
  JOIN friends f2 ON f1.fiend_id = f2.friend_id
   WHERE f1.user_id = 33 AND f2.user_id = u.id)
 FROM users u
 *LEFT OUTER JOIN friends f ON f.user_id = 33 AND f.friend_id = u.id*
 LEFT OUTER JOIN friend_requests s ON s.to_id = 33 AND s.from_id = u.id
 LEFT OUTER JOIN friend_requests r ON r.to_id = u.id AND r.from_id = 33
 WHERE u.id != 33 AND u.name LIKE '%John%' ORDER BY u.name;


 On Tue, Dec 9, 2014 at 10:15 AM, Robert DiFalco 
 robert.difa...@gmail.com wrote:

 I have users, friends, and friend_requests. I need a query that
 essentially returns a summary containing:

 * user (name, imageURL, bio, ...)
 * Friend status (relative to an active user)
* Is the user a friend of the active user?
* Has the user sent a friend request to the active user?
* Has the user received a friend request from the active user?
 * # of mutualFriends
 * Exclude the active user from the result set.

 So I have mocked this up two ways but both have complicated query plans
 that will be problematic with large data sets. I'm thinking that my lack of
 deep SQL knowledge is making me miss the obvious choice.

 Here's my two query examples:

 SELECT u.id, u.name, u.imageURL, u.bio,
CASE
   WHEN EXISTS(SELECT 1 FROM friends f WHERE f.user_id = 33 AND
 f.friend_id = u.id)   THEN 'isFriend'
   WHEN EXISTS(SELECT 1 FROM friend_requests s WHERE s.to_id = 33
 AND s.from_id = u.id) THEN 'hasSentRequest'
   WHEN EXISTS(SELECT 1 FROM friend_requests r WHERE r.to_id = u.id
 AND r.from_id = 33)   THEN 'hasReceivedRequest'
   ELSE 'none'
END AS friendStatus,
(SELECT COUNT(1)
   FROM friends f1
  JOIN friends f2 ON f1.friend_id = f2.friend_id
   WHERE f1.user_id = 33 AND f2.user_id = u.id) AS mutualFriends
 FROM users u
 WHERE u.id != 33 AND u.name LIKE 'John%' ORDER BY u.name;

 SELECT u.id, u.name, u.imageURL, u.bio,
CASE
   WHEN f.friend_id IS NOT NULL THEN 'isFriend'
   WHEN s.to_id IS NOT NULL THEN 'hasSentRequest'
   WHEN r.to_id IS NOT NULL THEN 'hasReceivedRequest'
   ELSE 'none'
END AS 'friendStatus',
(SELECT COUNT(1) AS d
   FROM friends f1
  JOIN friends f2 ON f1.fiend_id = f2.friend_id
   WHERE f1.user_id = 33 AND f2.user_id = u.id)
 FROM users u
 LEFT OUTER JOIN friend_requests s ON s.to_id = 33 AND s.from_id = u.id
 LEFT OUTER JOIN friend_requests r ON r.to_id = u.id AND r.from_id = 33
 WHERE u.id != 33 AND u.name LIKE 'John%' ORDER BY u.name;

 33 is just the id of the active user I am using for testing. The WHERE
 clause could be anything. I'm just using u.name here but I'm more
 concerned about the construction of the result set than the WHERE clause.
 These have more or less similar query plans, nothing that would change
 things factorially. Is this the best I can do or am I missing the obvious?

 Here are the tables:


 CREATE TABLE users (
   idBIGINT,
   name  VARCHAR,
   imageURL  VARCHAR
   created   TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
   phone_natlBIGINT,   /* National Phone Number */
   country_e164  SMALLINT, /* E164 country code */
   email VARCHAR(255),
   PRIMARY KEY (id),
   UNIQUE (email),
   UNIQUE (phone_natl, country_e164)
 );


 CREATE TABLE friends (
   user_id  BIGINT,
   friend_id   BIGINT,
   PRIMARY KEY (user_id, user_id),
   FOREIGN KEY (user_id)REFERENCES users(id) ON DELETE CASCADE,
   FOREIGN KEY (friend_id)  REFERENCES users(id) ON DELETE CASCADE
 );
 CREATE INDEX idx_friends_friend ON friends(friend_id);

 CREATE TABLE friend_requests (
   from_id  BIGINT,
   to_idBIGINT,
   created  TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
   PRIMARY KEY (from_id, user_id),
   FOREIGN KEY (from_id)  REFERENCES users(id) ON DELETE CASCADE,
   FOREIGN KEY (to_id)REFERENCES users(id) ON DELETE CASCADE
 );
 CREATE INDEX idx_friend_requests_to ON friend_requests(to_id);

 Let me know if you guys need anything else.




 Hello 

Re: [GENERAL] Help Optimizing a Summary Query

2014-12-11 Thread David G Johnston
Robert DiFalco wrote
 I have users, friends, and friend_requests. I need a query that
 essentially
 returns a summary containing:
 
 * user (name, imageURL, bio, ...)
 * Friend status (relative to an active user)
* Is the user a friend of the active user?
* Has the user sent a friend request to the active user?
* Has the user received a friend request from the active user?
 * # of mutualFriends
 * Exclude the active user from the result set.
 
 So I have mocked this up two ways but both have complicated query plans
 that will be problematic with large data sets. I'm thinking that my lack
 of
 deep SQL knowledge is making me miss the obvious choice.
 
 Here's my two query examples:
 
 SELECT u.id, u.name, u.imageURL, u.bio,
CASE
   WHEN EXISTS(SELECT 1 FROM friends f WHERE f.user_id = 33 AND
 f.friend_id = u.id)   THEN 'isFriend'
   WHEN EXISTS(SELECT 1 FROM friend_requests s WHERE s.to_id = 33   AND
 s.from_id = u.id) THEN 'hasSentRequest'
   WHEN EXISTS(SELECT 1 FROM friend_requests r WHERE r.to_id = u.id AND
 r.from_id = 33)   THEN 'hasReceivedRequest'
   ELSE 'none'
END AS friendStatus,
(SELECT COUNT(1)
   FROM friends f1
  JOIN friends f2 ON f1.friend_id = f2.friend_id
   WHERE f1.user_id = 33 AND f2.user_id = u.id) AS mutualFriends
 FROM users u
 WHERE u.id != 33 AND u.name LIKE 'John%' ORDER BY u.name;
 
 SELECT u.id, u.name, u.imageURL, u.bio,
CASE
   WHEN f.friend_id IS NOT NULL THEN 'isFriend'
   WHEN s.to_id IS NOT NULL THEN 'hasSentRequest'
   WHEN r.to_id IS NOT NULL THEN 'hasReceivedRequest'
   ELSE 'none'
END AS 'friendStatus',
(SELECT COUNT(1) AS d
   FROM friends f1
  JOIN friends f2 ON f1.fiend_id = f2.friend_id
   WHERE f1.user_id = 33 AND f2.user_id = u.id)
 FROM users u
 LEFT OUTER JOIN friend_requests s ON s.to_id = 33 AND s.from_id = u.id
 LEFT OUTER JOIN friend_requests r ON r.to_id = u.id AND r.from_id = 33
 WHERE u.id != 33 AND u.name LIKE 'John%' ORDER BY u.name;
 
 33 is just the id of the active user I am using for testing. The WHERE
 clause could be anything. I'm just using u.name here but I'm more
 concerned about the construction of the result set than the WHERE clause.
 These have more or less similar query plans, nothing that would change
 things factorially. Is this the best I can do or am I missing the obvious?

I dislike the multiple LEFT JOIN version, though I did not try to prove that
it is possible for it to give incorrect results.

The goal is to avoid looping - so you want to create temporary results that
will contain all of the data you plan to need and then join them together. 
CTE/WITH is the feature that can do this most easily.

I have no idea how this will perform relative to the CASE WHEN EXISTS
version but it seems like it should be faster.  Again, I don't believe your
original LEFT JOIN query is equivalent to either of these but I cannot be
certain without more effort than I am able to put forth.

Hybrid SQL Code (note in particular that you cannot have literals in the
WITH field alias area...)

WITH user_ref (ref_u_id) AS ( VALUES (33) )
, users_vis_a_vis_ref (u_id, ref_id) AS ( ... WHERE u_id != ref_u_id)
, user_friend (u_id, ref_u_id, 'Friend' AS status_uf) AS ( ... )
, user_sent_request (u_id, ref_u_id, 'Sent' AS status_usr) AS ( ... )
, user_recv_request (u_id, ref_u_id, 'Received' AS status_urr) AS ( ... )
, user_mutuals (u_id, ref_u_id, ## AS mutual_count) AS ( ... )

SELECT u_id, ref_u_id
, COALESCE(status_uf, status_usr, status_urr, 'None') AS FriendStatus
, COALESCE(mutual_count, 0) AS MutualFriendCount
FROM users_vis_a_vis_ref 
NATURAL LEFT JOIN user_friend
NATURAL LEFT JOIN user_sent_request 
NATURAL LEFT JOIN user_recv_request 
NATURAL LEFT JOIN user_mutuals

It is safe to use NATURAL here since you are fully controlling the source
relations since they all come from the CTE/WITH structure.
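
(One hedged way to flesh out the elided CTE bodies against the tables posted
earlier in the thread; 33 again stands in for the active user, the join
conditions are written out explicitly rather than via NATURAL, and this is a
sketch that has not been run against the original poster's data:)

WITH user_ref (ref_u_id) AS ( VALUES (33) )
, user_friend AS (
    SELECT f.friend_id AS u_id, 'Friend'::text AS status_uf
      FROM friends f, user_ref r
     WHERE f.user_id = r.ref_u_id )
, user_sent_request AS (
    SELECT s.from_id AS u_id, 'Sent'::text AS status_usr
      FROM friend_requests s, user_ref r
     WHERE s.to_id = r.ref_u_id )
, user_recv_request AS (
    SELECT q.to_id AS u_id, 'Received'::text AS status_urr
      FROM friend_requests q, user_ref r
     WHERE q.from_id = r.ref_u_id )
, user_mutuals AS (
    SELECT f2.user_id AS u_id, count(*) AS mutual_count
      FROM friends f1
      JOIN friends f2 ON f1.friend_id = f2.friend_id, user_ref r
     WHERE f1.user_id = r.ref_u_id
     GROUP BY f2.user_id )
SELECT u.id, u.name,
       COALESCE(uf.status_uf, usr.status_usr, urr.status_urr, 'none') AS friend_status,
       COALESCE(um.mutual_count, 0) AS mutual_friends
  FROM users u
 CROSS JOIN user_ref r
  LEFT JOIN user_friend       uf  ON uf.u_id  = u.id
  LEFT JOIN user_sent_request usr ON usr.u_id = u.id
  LEFT JOIN user_recv_request urr ON urr.u_id = u.id
  LEFT JOIN user_mutuals      um  ON um.u_id  = u.id
 WHERE u.id != r.ref_u_id
   AND u.name LIKE 'John%'
 ORDER BY u.name;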

David J.



--
View this message in context: 
http://postgresql.nabble.com/Help-Optimizing-a-Summary-Query-tp5829941p5830198.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help Optimizing a Summary Query

2014-12-09 Thread Robert DiFalco
I'm sorry, I missed a JOIN on the second variation. It is:

SELECT u.id, u.name, u.imageURL, u.bio,
   CASE
  WHEN f.friend_id IS NOT NULL THEN 'isFriend'
  WHEN s.to_id IS NOT NULL THEN 'hasSentRequest'
  WHEN r.to_id IS NOT NULL THEN 'hasReceivedRequest'
  ELSE 'none'
   END AS 'friendStatus',
   (SELECT COUNT(1) AS d
  FROM friends f1
  JOIN friends f2 ON f1.friend_id = f2.friend_id
  WHERE f1.user_id = 33 AND f2.user_id = u.id)
FROM users u
*LEFT OUTER JOIN friends f ON f.user_id = 33 AND f.friend_id = u.id*
LEFT OUTER JOIN friend_requests s ON s.to_id = 33 AND s.from_id = u.id
LEFT OUTER JOIN friend_requests r ON r.to_id = u.id AND r.from_id = 33
WHERE u.id != 33 AND u.name LIKE '%John%' ORDER BY u.name;


On Tue, Dec 9, 2014 at 10:15 AM, Robert DiFalco robert.difa...@gmail.com
wrote:

 I have users, friends, and friend_requests. I need a query that
 essentially returns a summary containing:

 * user (name, imageURL, bio, ...)
 * Friend status (relative to an active user)
* Is the user a friend of the active user?
* Has the user sent a friend request to the active user?
* Has the user received a friend request from the active user?
 * # of mutualFriends
 * Exclude the active user from the result set.

 So I have mocked this up two ways but both have complicated query plans
 that will be problematic with large data sets. I'm thinking that my lack of
 deep SQL knowledge is making me miss the obvious choice.

 Here's my two query examples:

 SELECT u.id, u.name, u.imageURL, u.bio,
CASE
   WHEN EXISTS(SELECT 1 FROM friends f WHERE f.user_id = 33 AND
 f.friend_id = u.id)   THEN 'isFriend'
   WHEN EXISTS(SELECT 1 FROM friend_requests s WHERE s.to_id = 33   AND
 s.from_id = u.id) THEN 'hasSentRequest'
   WHEN EXISTS(SELECT 1 FROM friend_requests r WHERE r.to_id = u.id
 AND r.from_id = 33)   THEN 'hasReceivedRequest'
   ELSE 'none'
END AS friendStatus,
(SELECT COUNT(1)
   FROM friends f1
  JOIN friends f2 ON f1.friend_id = f2.friend_id
   WHERE f1.user_id = 33 AND f2.user_id = u.id) AS mutualFriends
 FROM users u
 WHERE u.id != 33 AND u.name LIKE 'John%' ORDER BY u.name;

 SELECT u.id, u.name, u.imageURL, u.bio,
CASE
   WHEN f.friend_id IS NOT NULL THEN 'isFriend'
   WHEN s.to_id IS NOT NULL THEN 'hasSentRequest'
   WHEN r.to_id IS NOT NULL THEN 'hasReceivedRequest'
   ELSE 'none'
END AS 'friendStatus',
(SELECT COUNT(1) AS d
   FROM friends f1
  JOIN friends f2 ON f1.fiend_id = f2.friend_id
   WHERE f1.user_id = 33 AND f2.user_id = u.id)
 FROM users u
 LEFT OUTER JOIN friend_requests s ON s.to_id = 33 AND s.from_id = u.id
 LEFT OUTER JOIN friend_requests r ON r.to_id = u.id AND r.from_id = 33
 WHERE u.id != 33 AND u.name LIKE 'John%' ORDER BY u.name;

 33 is just the id of the active user I am using for testing. The WHERE
 clause could be anything. I'm just using u.name here but I'm more
 concerned about the construction of the result set than the WHERE clause.
 These have more or less similar query plans, nothing that would change
 things factorially. Is this the best I can do or am I missing the obvious?

 Here are the tables:


 CREATE TABLE users (
   idBIGINT,
   name  VARCHAR,
   imageURL  VARCHAR
   created   TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
   phone_natlBIGINT,   /* National Phone Number */
   country_e164  SMALLINT, /* E164 country code */
   email VARCHAR(255),
   PRIMARY KEY (id),
   UNIQUE (email),
   UNIQUE (phone_natl, country_e164)
 );


 CREATE TABLE friends (
   user_id  BIGINT,
   friend_id   BIGINT,
   PRIMARY KEY (user_id, user_id),
   FOREIGN KEY (user_id)REFERENCES users(id) ON DELETE CASCADE,
   FOREIGN KEY (friend_id)  REFERENCES users(id) ON DELETE CASCADE
 );
 CREATE INDEX idx_friends_friend ON friends(friend_id);

 CREATE TABLE friend_requests (
   from_id  BIGINT,
   to_idBIGINT,
   created  TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
   PRIMARY KEY (from_id, user_id),
   FOREIGN KEY (from_id)  REFERENCES users(id) ON DELETE CASCADE,
   FOREIGN KEY (to_id)REFERENCES users(id) ON DELETE CASCADE
 );
 CREATE INDEX idx_friend_requests_to ON friend_requests(to_id);

 Let me know if you guys need anything else.




Re: [GENERAL] Help Optimizing a Summary Query

2014-12-09 Thread Arthur Silva
On Tue, Dec 9, 2014 at 4:18 PM, Robert DiFalco robert.difa...@gmail.com
wrote:

 I'm sorry, I missed a JOIN on the second variation. It is:

 SELECT u.id, u.name, u.imageURL, u.bio,
CASE
   WHEN f.friend_id IS NOT NULL THEN 'isFriend'
   WHEN s.to_id IS NOT NULL THEN 'hasSentRequest'
   WHEN r.to_id IS NOT NULL THEN 'hasReceivedRequest'
   ELSE 'none'
END AS 'friendStatus',
(SELECT COUNT(1) AS d
   FROM friends f1
  JOIN friends f2 ON f1.fiend_id = f2.friend_id
   WHERE f1.user_id = 33 AND f2.user_id = u.id)
 FROM users u
 *LEFT OUTER JOIN friends f ON f.user_id = 33 AND f.friend_id = u.id*
 LEFT OUTER JOIN friend_requests s ON s.to_id = 33 AND s.from_id = u.id
 LEFT OUTER JOIN friend_requests r ON r.to_id = u.id AND r.from_id = 33
 WHERE u.id != 33 AND u.name LIKE '%John%' ORDER BY u.name;


 On Tue, Dec 9, 2014 at 10:15 AM, Robert DiFalco robert.difa...@gmail.com
 wrote:

 I have users, friends, and friend_requests. I need a query that
 essentially returns a summary containing:

 * user (name, imageURL, bio, ...)
 * Friend status (relative to an active user)
* Is the user a friend of the active user?
* Has the user sent a friend request to the active user?
* Has the user received a friend request from the active user?
 * # of mutualFriends
 * Exclude the active user from the result set.

 So I have mocked this up two ways but both have complicated query plans
 that will be problematic with large data sets. I'm thinking that my lack of
 deep SQL knowledge is making me miss the obvious choice.

 Here's my two query examples:

 SELECT u.id, u.name, u.imageURL, u.bio,
CASE
   WHEN EXISTS(SELECT 1 FROM friends f WHERE f.user_id = 33 AND
 f.friend_id = u.id)   THEN 'isFriend'
   WHEN EXISTS(SELECT 1 FROM friend_requests s WHERE s.to_id = 33
 AND s.from_id = u.id) THEN 'hasSentRequest'
   WHEN EXISTS(SELECT 1 FROM friend_requests r WHERE r.to_id = u.id
 AND r.from_id = 33)   THEN 'hasReceivedRequest'
   ELSE 'none'
END AS friendStatus,
(SELECT COUNT(1)
   FROM friends f1
  JOIN friends f2 ON f1.friend_id = f2.friend_id
   WHERE f1.user_id = 33 AND f2.user_id = u.id) AS mutualFriends
 FROM users u
 WHERE u.id != 33 AND u.name LIKE 'John%' ORDER BY u.name;

 SELECT u.id, u.name, u.imageURL, u.bio,
CASE
   WHEN f.friend_id IS NOT NULL THEN 'isFriend'
   WHEN s.to_id IS NOT NULL THEN 'hasSentRequest'
   WHEN r.to_id IS NOT NULL THEN 'hasReceivedRequest'
   ELSE 'none'
END AS 'friendStatus',
(SELECT COUNT(1) AS d
   FROM friends f1
  JOIN friends f2 ON f1.fiend_id = f2.friend_id
   WHERE f1.user_id = 33 AND f2.user_id = u.id)
 FROM users u
 LEFT OUTER JOIN friend_requests s ON s.to_id = 33 AND s.from_id = u.id
 LEFT OUTER JOIN friend_requests r ON r.to_id = u.id AND r.from_id = 33
 WHERE u.id != 33 AND u.name LIKE 'John%' ORDER BY u.name;

 33 is just the id of the active user I am using for testing. The WHERE
 clause could be anything. I'm just using u.name here but I'm more
 concerned about the construction of the result set than the WHERE clause.
 These have more or less similar query plans, nothing that would change
 things factorially. Is this the best I can do or am I missing the obvious?

 Here are the tables:


 CREATE TABLE users (
   idBIGINT,
   name  VARCHAR,
   imageURL  VARCHAR
   created   TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
   phone_natlBIGINT,   /* National Phone Number */
   country_e164  SMALLINT, /* E164 country code */
   email VARCHAR(255),
   PRIMARY KEY (id),
   UNIQUE (email),
   UNIQUE (phone_natl, country_e164)
 );


 CREATE TABLE friends (
   user_id  BIGINT,
   friend_id   BIGINT,
   PRIMARY KEY (user_id, user_id),
   FOREIGN KEY (user_id)REFERENCES users(id) ON DELETE CASCADE,
   FOREIGN KEY (friend_id)  REFERENCES users(id) ON DELETE CASCADE
 );
 CREATE INDEX idx_friends_friend ON friends(friend_id);

 CREATE TABLE friend_requests (
   from_id  BIGINT,
   to_idBIGINT,
   created  TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
   PRIMARY KEY (from_id, user_id),
   FOREIGN KEY (from_id)  REFERENCES users(id) ON DELETE CASCADE,
   FOREIGN KEY (to_id)REFERENCES users(id) ON DELETE CASCADE
 );
 CREATE INDEX idx_friend_requests_to ON friend_requests(to_id);

 Let me know if you guys need anything else.




Hello Robert, none of your schemas worked for me, here's a clean version

CREATE TABLE users (
  idBIGINT,
  name  VARCHAR,
  imageURL  VARCHAR,
  created   TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  phone_natlBIGINT,
  country_e164  SMALLINT,
  email VARCHAR(255),
  PRIMARY KEY (id),
  UNIQUE (email),
  UNIQUE (phone_natl, country_e164)
);


CREATE TABLE friends (
  user_id  BIGINT,
  friend_id   BIGINT,
  PRIMARY KEY (user_id, friend_id),
  FOREIGN KEY (user_id)REFERENCES users(id) ON DELETE CASCADE,
  FOREIGN 

Re: [GENERAL] Help with PostgreSQL 9.4 to expand jsonb int array into table with row numbers

2014-11-04 Thread hari . fuchs
David G Johnston david.g.johns...@gmail.com writes:

 Neil Tiffin-3 wrote
 Trying to wrap my head around postgresql 9.4 jsonb and would like some
 help figuring out how to do the following.
 
 Given the following example jsonb:
 
 ‘{“name1” : value1, “name2”  : value2, “name3” : [int1, int2, int3]
 }’::jsonb AS table1.column1
  
 Wanted: Return the “name3” array only, as a table with a return signature
 of 
 
 TABLE( var_name varchar, var_value int, var_row_num int)
 
 So the resulting data would look like this:
  
 (‘name3’, int1, 1)
 (‘name3’, int2, 2)
 (‘name3’, int3, 3)
 
 Assume the array could be any length except zero and ‘name3’ is guaranteed
 to exist.
 
 Also posted on stackoverflow:
 
 http://stackoverflow.com/questions/26691725/postgresql-9-4-expand-jsonb-int-array-into-table-with-row-numbers

 Not syntax checked but...

 SELECT 'name3', int_text::integer AS int, int_ord
 FROM ( VALUES (...) ) src (column1)
 LATERAL ROWS FROM(
 json_array_elements(column1->'name3')
 ) WITH ORDINALITY jae (int_text, int_ord)

 Both WITH ORDINALITY and jsonb were introduced in 9.4; it is possible to
 make this work in all supported versions of PostgreSQL through the liberal
 use of CTE (WITH) as well as, possibly, the generate_series() function.

I think this can just be written as

SELECT 'name3' AS var_name,
   json_array_elements(column1->'name3') AS var_value,
   row_number() OVER () AS var_row_num
FROM table1
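
(For completeness - a sketch, not from the thread, of the 9.4 WITH ORDINALITY
form mentioned above, assuming column1 is of type jsonb as in the original
question:)

SELECT 'name3' AS var_name,
       e.var_value::int AS var_value,
       e.var_row_num::int AS var_row_num
  FROM table1,
       LATERAL jsonb_array_elements_text(column1->'name3')
               WITH ORDINALITY AS e (var_value, var_row_num);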



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help related to Postgresql for RHEL 6.5

2014-08-29 Thread David G Johnston
Yogesh. Sharma wrote
 Dear David,
 
 Are you currently using PostgreSQL?
 Currently we are using PostgreSQL 8.1.18 version on RHEL 5.8.
 Now we plan to update this to PostgreSQL 9.0 with RHEL 6.5. As in
 version 9.0 I found the fewest compatibility issues.
 
 So, please guide me.
 
 Regards,

Guidance is why we write documentation. If you have specific questions or
concerns after reading the documentation, you can ask here.

David J.





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Help-related-to-Postgresql-for-RHEL-6-5-tp5816742p5816876.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


  1   2   3   4   5   6   7   8   9   >