Re: [GENERAL] Postgres 10.1 fails to start: server did not start in time

2017-11-12 Thread Peter J. Holzer
On 2017-11-12 13:26:58 +0100, Christoph Berg wrote:
> Re: To Adam Brusselback 2017-11-11 
> <2017205316.u56lkmkakdmcx...@msg.df7cb.de>
> > I'm investigating if it's a good idea to tell systemd to ignore the
> > exit code of pg_ctl(cluster).
> 
> Telling systemd to ignore ExecStart errors seems to be the correct
> solution. The service will still be active, with the startup error
> being shown:

Wouldn't it be better to remove the timeout? If some other service
depends on PostgreSQL it probably shouldn't be started until PostgreSQL
is really up and services which don't need PostgreSQL (e.g. SSH or X11
login or a web- or mail server) shouldn't depend on it.
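
For example (a sketch only; the unit name is the one the Debian/Ubuntu
packaging uses for a 10/main cluster, so adjust it to your setup), a drop-in
created with "systemctl edit" could simply disable the start timeout:

    # systemctl edit postgresql@10-main.service
    [Service]
    # no limit on startup time; older systemd versions want
    # TimeoutStartSec=0 instead of "infinity"
    TimeoutStartSec=infinity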

One of the purported advantages of systemd over SystemV init is that it
starts up services in parallel, so a service which takes a long (or
infinite) time to start doesn't block other services.

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


signature.asc
Description: PGP signature


Re: [GENERAL] Client Authentication methods

2017-11-10 Thread Peter J. Holzer
On 2017-11-10 08:25:24 -0500, chiru r wrote:
> I am trying to understand the Authentication method in pg_hba.conf file
> (password & md5) in PostgreSQL database server. 
> 
> I am assuming that the user provides the connection string host/user/
> password, then the client will go and contact the DB server's pg_hba.conf file
> in memory without carrying the password over the network initially, and then it
> confirms the authentication method from pg_hba.conf, then it decides whether it
> sends a clear text or md5 encrypted password from client to server to make a
> session?

I'm not sure what "it" refers to in this sentence. If "it" refers to the
client (as grammatically it should) then the answer is no. The client
doesn't have access to the pg_hba.conf file.

The client connects to the server, sending the username and database
name, but not (yet) the password. Then the server checks the pg_hba.conf
file to determine which authentication method to use. The server then
sends an authentication request to the client, to which the client sends
a response (including, or based on, the password).
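
For illustration, the METHOD column of the matching pg_hba.conf line is what
decides which request the server sends (database, user and address here are
made up):

    # TYPE  DATABASE  USER  ADDRESS        METHOD
    host    mydb      all   192.0.2.0/24   md5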


> Is my assumption correct? Or what exactly does it change for the client
> if I use md5/password in the pg_hba.conf file on the DB server?

See
https://www.postgresql.org/docs/10/static/auth-methods.html#AUTH-PASSWORD

With method password, passwords are sent in plain text. With md5, an md5
hash of the password, the username, and a nonce is sent instead.
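
As a rough sketch of the md5 case (user, password and salt are made up; in the
real protocol the salt is four random bytes chosen by the server for each
connection, not a text string):

    -- what the server stores for user 'alice' with password 'secret'
    select 'md5' || md5('secret' || 'alice');
    -- what the client sends back after receiving the salt
    select 'md5' || md5(md5('secret' || 'alice') || 'salt');

So with md5 the password itself never crosses the network.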

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


signature.asc
Description: PGP signature


Re: [GENERAL] EXPLAIN command just hangs...

2017-11-03 Thread Peter J. Holzer
On 2017-11-02 20:51:23 +, Rhhh Lin wrote:
[...]
> where timestamp BETWEEN 150667656 AND 150875022 
[...]
> Also, as a sidenote - can someone please expand on why one (I was not involved
> in the creation of this DB/schema definition) would choose to have the
> definition of the timestamp column as a bigint in this case?

The numbers look like Java timestamps (Milliseconds since the epoch). So
probably the programs accessing the database are written in Java and the
programmer decided that it's simpler to do all timestamp computations in
Java than in SQL. Or maybe the programmer just felt more comfortable
with Java-like timestamps than with calendar dates. (I have done the
same (with Unix timestamps, i.e. seconds since the epoch). Although in
the cases where I've done it I had the additional excuse that the
database didn't support timestamps with timezones, which isn't the case
for PostgreSQL.)
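
If they really are milliseconds since the epoch, converting them for display
is a one-liner (the value below is made up, since the values in the query
above are truncated):

    select to_timestamp(1509702000000 / 1000.0);  -- bigint milliseconds -> timestamptz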

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


signature.asc
Description: PGP signature


[GENERAL] Query plan for Merge Semi Join

2017-11-01 Thread Peter J. Holzer
[The EXPLAIN ANALYZE output quoted at the top of this message is truncated in
the archive; only its last line survives:]

 Execution time: 3275.341 ms

That is almost certainly not ideal, but this is not my question.

My question is what does that merge semi join actually do?

In general a merge join needs two inputs sorted by the merge key. It
walks both in parallel and joins matching rows. Correct?

The first input is the index scan. 

The second is the output of the materialize. Since we need only the
column arbeitsvolumen this would be something like
select arbeitsvolumen from facttable_kon_eh where thema='E'
order by arbeitsvolumen;

So far so good. But there are a few things I don't understand:

Where does Rows Removed by Filter: 3874190 come from? The number doesn't
match any count I can come up with: It is a bit larger than the total number
of rows where term is not null but smaller than the total number of
rows where the filter doesn't match. And it is much larger than the
number of rows I would expect if the merge stopped once there could not
be a possible match any more. And does it really check the filter
condition even for rows that don't satisfy the merge condition? Of
course it makes sense from a modularization point of view, but that's a
lot of random accesses, most of which are unnecessary.

The materialize returns 184791 rows. This one I understand: There are 6
non-null distinct values of arbeitsvolumen in facttable_kon_eh, and each
appears 36958 times. 36958 * 5 + 1 = 184791. So it stops once it reaches
the largest value. (Although now I'm wondering how it knows that this is
the largest value without scanning to the end.)

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


signature.asc
Description: PGP signature


Re: [GENERAL] "Shared strings"-style table

2017-10-13 Thread Peter J. Holzer
On 2017-10-13 12:49:21 -0300, Seamus Abshere wrote:
> In the spreadsheet world, there is this concept of "shared strings," a
> simple way of compressing spreadsheets when the data is duplicated in
> many cells.
> 
> In my database, I have a table with >200 million rows and >300 columns
> (all the households in the United States). For clarity of development
> and debugging, I have not made any effort to normalize its contents, so
> millions of rows have, for example, "SINGLE FAMILY RESIDENCE /
> TOWNHOUSE" (yes, that whole string!) instead of some code representing
> it.
> 
> Theoretically / blue sky, could there be a table or column type that
> transparently handles "shared strings" like this, reducing size on disk
> at the cost of lookup overhead for all queries?

Theoretically it's certainly possible and I think some column-oriented
databases store data that way.

> (I guess maybe it's like TOAST, but content-hashed and de-duped and not
> only for large objects?)

Yes, but if you want to automatically delete entries which are no longer
needed you need to keep track of that. So either a reference count or an
index lookup on the parent table. This is starting to look a lot like a
foreign key - just hidden from the user. Performance would probably be
similar, too.
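
For comparison, the explicit (non-hidden) version of that is just an ordinary
dictionary table plus a foreign key (all names are invented):

    create table property_type (
        id   smallint primary key,
        name text unique not null
    );
    insert into property_type values (1, 'SINGLE FAMILY RESIDENCE / TOWNHOUSE');

    create table household (
        id               bigint primary key,
        property_type_id smallint references property_type(id)
        -- ... plus the other ~300 columns
    );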

We have done something similar (although for different reasons). We
ended up doing the "join" in the application. For most purposes we don't
need the descriptive strings and when we need them we can do a
dictionary lookup just before sending them to the client (of course the
dictionary has to be read from the database, too, but it doesn't change
that often, so it can be cached). And from a software maintainability
POV I think a dictionary lookup in Perl is a lot nicer than 50 joins
(or 300 in your case).

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


signature.asc
Description: PGP signature


Re: [GENERAL] pglogical bidirectional replication of sequences

2017-09-01 Thread Peter J. Holzer
On 2017-09-01 10:29:51 +0200, Peter J. Holzer wrote:
> pglogical supports replication of sequences, and although the way it
> does this suggests that it can't really work in both directions
> (actually I'm sceptical that it works reliably in one direction), of
> course I had to try it.
> 
> So I created a sequence on both nodes and called
> select pglogical.replication_set_add_sequence('default', 'test_sequence');
> on both nodes.
> 
> The result was ... interesting.
> 
> First I got the same sequence (1, 2, 3, 4, 5) on both nodes.
> 
> After a few seconds the replication kicked in, and then I got the same
> value (1005) on both nodes most of the time, with a few variants (2005,
> 3005) thrown in.
> 
> In a word, the sequence was completely unusable.

[...some failed attempts to recover...]

> So, is there a way to recover from this situation without drastic
> measures like nuking the whole database?

To answer my own question:

delete from pglogical.queue where message_type='S';
on both nodes seems to have the desired effect.
A vacuum full pglogical.queue afterwards is a good idea to get the
bloated table back to a reasonable size.

hp



-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


signature.asc
Description: Digital signature


Re: [GENERAL] pglogical bidirectional replication of sequences

2017-09-01 Thread Peter J. Holzer
On 2017-09-01 09:57:52 -0600, Rob Sargent wrote:
> On 09/01/2017 02:29 AM, Peter J. Holzer wrote:
> >TLDR: Don't.
> >
> >I'm currently conducting tests which should eventually lead to a 2 node
> >cluster with working bidirectional logical replication.
> >
> >(Postgresql 9.6.4-1.pgdg90+1, pglogical 2.0.1-1.jessie+1 on Debian 9
> >(Stretch))
> >
> >pglogical supports replication of sequences, and although the way it
> >does this suggests that it can't really work in both directions
> >(actually I'm sceptical that it works reliably in one direction), of
> >course I had to try it.
> >
[and it blew up]

> I trust you mean don't use sequences

I trust you don't mean what I understood ;-).

Seriously:

Sequences in general are fine and very useful. I think they should be
used where appropriate.

Sequences and logical replication don't mix well. That still doesn't
mean that you can't use sequences, you just have to be careful how you
use them. 

Since replicating sequence state doesn't really work, I think it is best
to use independent sequences on each node and just configure them in a
way that they can not produce the same values. A naive approach would be
to use MINVALUE/MAXVALUE/START WITH to ensure non-overlapping ranges. A
somewhat more elegant approach is to increment by $n$ (the number of
nodes in the cluster) and use different start values (I got that idea
from
http://thedumbtechguy.blogspot.co.at/2017/04/demystifying-pglogical-tutorial.html).
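
A sketch of that scheme for a two node cluster (the sequence name is made up):

    -- on node 1:
    create sequence seq_example increment by 2 start with 1;
    -- on node 2:
    create sequence seq_example increment by 2 start with 2;

Node 1 then hands out 1, 3, 5, ... and node 2 hands out 2, 4, 6, ..., so the
two nodes can never generate the same value.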
 

There are other ways to get unique ids: A uuid should work pretty well
in most cases, and in some even a random 64 bit int might be enough.

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


signature.asc
Description: Digital signature


[GENERAL] pglogical bidirectional replication of sequences

2017-09-01 Thread Peter J. Holzer
TLDR: Don't.

I'm currently conducting tests which should eventually lead to a 2 node
cluster with working bidirectional logical replication.

(Postgresql 9.6.4-1.pgdg90+1, pglogical 2.0.1-1.jessie+1 on Debian 9
(Stretch))

pglogical supports replication of sequences, and although the way it
does this suggests that it can't really work in both directions
(actually I'm sceptical that it works reliably in one direction), of
course I had to try it.

So I created a sequence on both nodes and called
select pglogical.replication_set_add_sequence('default', 'test_sequence');
on both nodes.

The result was ... interesting.

First I got the same sequence (1, 2, 3, 4, 5) on both nodes.

After a few seconds the replication kicked in, and then I got the same
value (1005) on both nodes most of the time, with a few variants (2005,
3005) thrown in.

In a word, the sequence was completely unusable.

Experiment completed, so I removed the sequence from the replication
set:

select pglogical.replication_set_remove_sequence('default', 'test_sequence');
on both nodes.

But the behaviour of the sequence doesn't change. It still returns 1005
most of the time, and sometimes 2005 or 3005. This is true even after
restarting both nodes. 

Plus, I can't drop the sequence any more (as the user who created the
sequence):

wds=> drop sequence public.test_sequence ;
ERROR:  permission denied for schema pglogical

So, clearly, pglogical is still managing that sequence.

If I drop the sequence as postgres and then recreate it, it works
normally for some time (also the sequence on the other node now works
normally), but after some time, the replication kicks in again and the
sequence is stuck again at 1005. 

So, is there a way to recover from this situation without drastic
measures like nuking the whole database?

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


signature.asc
Description: Digital signature


Re: [GENERAL] Porting libpq to QNX 4.25

2017-08-25 Thread Peter J. Holzer
On 2017-08-22 12:57:15 -0300, marcelo wrote:
> We'll replace those QNX machines with WIndows XP ones

The future is already here — it's just not very evenly distributed.

SCNR,
hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


signature.asc
Description: Digital signature


[GENERAL] Postgresql_for_odoo

2017-08-24 Thread Fadhel J Muhammad
The service Postgresql_for_odoo is not found, and I get an internal server error
when opening localhost:8069. If I uninstall PostgreSQL, I get an error stopping
and deleting postgresql_for_odoo. Thanks


Re: [GENERAL] Count column with name 'count' returns multiple rows. Why?

2017-08-19 Thread Peter J. Holzer
On 2017-08-18 15:57:39 -0500, Justin Pryzby wrote:
> On Fri, Aug 18, 2017 at 10:47:37PM +0200, Peter J. Holzer wrote:
> > On 2017-08-18 06:37:15 -0500, Justin Pryzby wrote:
> > > On Fri, Aug 18, 2017 at 01:01:45PM +0200, Rob Audenaerde wrote:
> > > > Can anyone please explain this behaviour?
> > > 
> > > https://www.postgresql.org/docs/9.6/static/sql-expressions.html#SQL-EXPRESSIONS-FUNCTION-CALLS
> > > https://www.postgresql.org/docs/9.6/static/rowtypes.html#ROWTYPES-USAGE
> > 
> > Maybe I overlooked it, but I don't see anything in those pages which
> > explains why «count» is parsed as a column name in the first example and
> > as a function name in the second.
> > 
> > Nor do I see what «count(base.*)» is supposed to mean. It seems to be
> > completely equivalent to just writing «count», but the part in
> > parentheses is not ignored: It has to be either the table name or the
> > table name followed by «.*». Everything else I tried either led to a
> > syntax error or to «count» being recognized as a function. So apparently
> > columnname open-parenthesis tablename closed-parenthesis is a specific
> > syntactic construct, but I can't find it documented anywhere.
> 
> | Another special syntactical behavior associated with composite values is that
> | we can use functional notation for extracting a field of a composite value.
> | The simple way to explain this is that the notations field(table) and
> | table.field are interchangeable. For example, these queries are equivalent:

Thanks. I see it now.
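
For anyone else who stumbles over this, a minimal example of that equivalence,
reproducing the behaviour Rob reported (table and data are made up):

    create table mytable (id int, count int);
    insert into mytable values (1, 10), (2, 20);
    select count(base.*) from mytable base;  -- field notation: one row per row (10, 20)
    select count(1)      from mytable base;  -- the aggregate: a single row (2)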

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


signature.asc
Description: Digital signature


Re: [GENERAL] Count column with name 'count' returns multiple rows. Why?

2017-08-18 Thread Peter J. Holzer
On 2017-08-18 06:37:15 -0500, Justin Pryzby wrote:
> On Fri, Aug 18, 2017 at 01:01:45PM +0200, Rob Audenaerde wrote:
> > I don't understand why this query:
> > 
> >select count(base.*) from mytable base;
> > 
> > does return multiple rows.
> > 
> >select count(1) from mytable base;
> > 
> > returns the proper count.
> > 
> > There is a column with the name 'count'.
> > 
> > Can anyone please explain this behaviour?
> 
> https://www.postgresql.org/docs/9.6/static/sql-expressions.html#SQL-EXPRESSIONS-FUNCTION-CALLS
> https://www.postgresql.org/docs/9.6/static/rowtypes.html#ROWTYPES-USAGE

Maybe I overlooked it, but I don't see anything in those pages which
explains why «count» is parsed as a column name in the first example and
as a function name in the second.

Nor do I see what «count(base.*)» is supposed to mean. It seems to be
completely equivalent to just writing «count», but the part in
parentheses is not ignored: It has to be either the table name or the
table name followed by «.*». Everything else I tried either led to a
syntax error or to «count» being recognized as a function. So apparently
columnname open-parenthesis tablename closed-parenthesis is a specific
syntactic construct, but I can't find it documented anywhere.

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


signature.asc
Description: Digital signature


Re: [GENERAL] Dealing with ordered hierarchies

2017-07-31 Thread Peter J. Holzer
On 2017-07-25 01:15:56 +1200, Tim Uckun wrote:
> I don't like the approach with a large increment. It would mean complicated
> logic to see if you filled the gap and then update all the other peers if you
> did. It sounds like the re-order is going to be expensive no matter what. My
> primary concern are race conditions though. What if two or more users are
> trying to update the hierarchy either by inserts or updates? I can definitely
> see a situation where transactions trip over each other.

You could add a unique index over (parent, sequence_number). That way
two transactions won't be able to add a node with the same sequence
number under the same parent. You will have to handle duplicate key
errors, though.
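
Something along these lines (table and column names invented to match the
discussion):

    create unique index on hierarchy (parent, sequence_number);

Of two concurrent transactions inserting the same sequence_number under the
same parent, one will then fail with a unique_violation and has to retry.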

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


signature.asc
Description: Digital signature


Re: [GENERAL] plpgsql function with offset - Postgres 9.1

2017-06-15 Thread Peter J. Holzer
On 2017-06-16 10:19:45 +1200, Patrick B wrote:
> 2017-05-29 19:27 GMT+12:00 Albe Laurenz <laurenz.a...@wien.gv.at>:
> Patrick B wrote:
> > I am running a background task on my DB, which will copy data from 
> tableA
> to tableB. For
> > that, I'm writing a PL/PGSQL function which basically needs to do the
> following:
> >
> >
> > 1.    Select the data from tableA
> > 2.    The limit will be put when calling the function
> > 3.    insert the selected data on Step 1 onto new table
[...]
> >               FOR row IN EXECUTE '
> >                           SELECT
> >                                   id,
> >                                   path,
> >                                   name,
> >                                   name_last,
> >                                   created_at
> >                           FROM
> >                                   tablea
> >                           WHERE
> >                                   ready = true
> >                           ORDER BY 1 LIMIT ' || rows || ' OFFSET ' ||
> rows || ''
> 
> '... LIMIT ' || p_limit || ' OFFSET ' || p_offset
> 
> >               LOOP
> 
> num_rows := num_rows + 1;
>
> >               INSERT INTO tableB (id,path,name,name_last,created_at)
> >               VALUES (row.id,row.path,row.name,row.
> name_last,row.created_at);
> >
> >               END LOOP;
[...]
> 
> There are two problems with this approach:
> 
> 1. It will do the wrong thing if rows are added or deleted in "tablea"
> while
>    you process it.
> 
> 
> 
> There will be actually records being inserted in tablea while processing the
> migration Any ideas here?

Is id monotonically increasing? You might be able to use that, as Albe
suggests:

> The solution is to avoid OFFSET and to use "keyset pagination":
> http://use-the-index-luke.com/no-offset

But it works only if rows cannot become ready after their id range has
already been processed. Otherwise you will miss them.
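
A sketch of the keyset variant (:last_seen_id is a placeholder for the largest
id copied in the previous batch; 2000 is the batch size mentioned below):

    select id, path, name, name_last, created_at
    from tablea
    where ready and id > :last_seen_id
    order by id
    limit 2000;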

> I can add another column in tablea, like example: row_migrated boolean --> if
> that helps

Yes that's probably the best way. Instead of using an additional column
you could also make ready tristate: New -> ready_for_migration -> migrated.

> 2. Queries with hight OFFSET values have bad performance.
> 
> 
> No problem. The plan is to perform 2k rows at once, which is not much.

Are rows deleted from tablea after they are migrated? Otherwise you will
have a problem:

select ... limit 2000 offset 1234000

will have to retrieve 1236000 rows and then discard 1234000 of them.

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


signature.asc
Description: Digital signature


Re: [GENERAL] storing large files in database - performance

2017-05-16 Thread Peter J. Holzer
On 2017-05-16 12:25:03 +, Eric Hill wrote:
> I searched and found a few discussions of storing large files in the database
> in the archives, but none that specifically address performance and how large
> of files can realistically be stored in the database.
> 
>  
> 
> I have a node.js application using PostgreSQL to store uploaded files.  The
> column in which I am storing the file contents is of type “bytea” with
> “Storage” type set to “EXTENDED”.

I have mentioned this little experiment before, but I finally put the
results on my web site: https://www.hjp.at/databases/blob-bench/

(Please note that so far I have run this only on one system.
Generalizing to other systems might be premature).


> Storing a 12.5 MB file is taking 10 seconds, and storing a 25MB file
> is taking 37 seconds.  Two notable things about those numbers:  It
> seems like a long time, and the time seems to grow exponentially with
> file size rather than linearly.
> 
>  
> 
> Do these numbers surprise you?

Yes. On my system, storing a 25 MB bytea value takes well under 1 second.

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


signature.asc
Description: Digital signature


Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis, amqp, s3 in the future?

2017-05-05 Thread Peter J. Holzer
On 2017-05-05 11:46:55 -0700, John R Pierce wrote:
> On 5/5/2017 11:28 AM, Peter J. Holzer wrote:
> 
> On 2017-05-04 23:08:25 +0200, Sven R. Kunze wrote:
> 
> On 03.05.2017 12:57, Thomas Güttler wrote:
> 
> On 02.05.2017 at 05:43, Jeff Janes wrote:
> 
> No.  You can certainly use PostgreSQL to store blobs.  But 
> then, you
> need to store the PostgreSQL data **someplace**.
> If you don't store it in S3, you have to store it somewhere 
> else.
> 
> I don't understand what you mean here. AFAIK storing blobs in PG 
> is not
> recommended since it is not very efficient.
> 
> Seems like several people here disagree with this conventional wisdom.
> 
> I think it depends very much on what level of "efficiency" you need. On
> my home server (i5 processor, 32GB RAM, Samsung 850 SSD - not a piece of
> junk, but not super powerful either) I can retrieve a small blob from a
> 100GB table in about 0.1 ms, and for large blobs the speed approaches
> 200MB/s. For just about everything I'd do on that server (or even at
> work) this is easily fast enough.
> 
> 
> S3 is often used for terabyte to petabyte file collections.   I would not want
> to burden my relational database with this.

I repeat the first sentence I wrote: "I think it depends very much
on what level of 'efficiency' you need." Just because some people need
to store petabytes of blob data doesn't mean everybody does. If you need
to store petabytes of blobs, PostgreSQL may not be the right tool. But
it may be the right tool if you just need to store a few thousand PDFs.
To tell people to never store blobs in PostgreSQL because PostgreSQL is
"not efficient" is just bullshit. There are many factors which determine
how you should store your data, and "efficiency" (however that is
defined, if it's defined at all and not just used as a buzzword) is only
one of them - and rarely, in my experience, the most important one.

hp

-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis, amqp, s3 in the future?

2017-05-05 Thread Peter J. Holzer
On 2017-05-04 23:08:25 +0200, Sven R. Kunze wrote:
> On 03.05.2017 12:57, Thomas Güttler wrote:
> >On 02.05.2017 at 05:43, Jeff Janes wrote:
> >>No.  You can certainly use PostgreSQL to store blobs.  But then, you
> >>need to store the PostgreSQL data **someplace**.
> >>If you don't store it in S3, you have to store it somewhere else.
> >
> >I don't understand what you mean here. AFAIK storing blobs in PG is not
> >recommended since it is not very efficient.
> 
> Seems like several people here disagree with this conventional wisdom.

I think it depends very much on what level of "efficiency" you need. On
my home server (i5 processor, 32GB RAM, Samsung 850 SSD - not a piece of
junk, but not super powerful either) I can retrieve a small blob from a
100GB table in about 0.1 ms, and for large blobs the speed approaches
200MB/s. For just about everything I'd do on that server (or even at
work) this is easily fast enough.

Sure, just telling the kernel "send data from file descriptor A (which
happens to be a file) to file descriptor B (a socket)" is a lot more
efficient than copying data from disk into a postgresql process, then
from that process to an application server, from that to the webserver
and that finally sends it to the socket. But if that just lets my server
be 99.9% idle instead of 99.0% idle, I haven't gained much. Similarly,
if my server spends 90% of its resources doing other stuff, I won't
gain much by optimizing this (I should better optimize that other stuff
it's spending so much time on).

I am in this regard a firm believer in not optimizing prematurely. Do
whatever makes sense from an application point of view. If the blobs are
logically part of some other data (e.g. PDFs in a literature database),
store them together (either all of them in PostgreSQL, or all in some
NoSQL database, or maybe on stone tablets, if that makes sense for some
reason). Only if you have good reason[1] to believe that physically
separating data which logically belongs together will resolve a
bottleneck, then by all means separate them.

hp

[1] "I read somewhere on the internet" is usually not a good reason.

-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


Re: [GENERAL] Handling psql lost connections

2017-03-30 Thread Peter J. Holzer
On 2017-03-29 08:49:57 -0700, Steve Crawford wrote:
> When firewalls/VPNs stand between my psql client and a remote PostgreSQL 
> server
> the connection will on occasion time out and drop. This results in the
> following scenario:
> 
> -Leave for lunch mid project - leave psql open.
> 
> -Return from lunch, complete and submit large query.
> 
> -Notice query is taking too long. cancel it.
> 
> -Cancel doesn't return - realize that connection has dropped.
> 
> -Kill psql - history is not written out. Start query from scratch.
> 
> Is there:
[...]
> Yes, I know I and my coworkers could spend brain cycles trying to unerringly
> remember to close and restart connections, write all queries in an external
> editor and then submit them, etc. but I'm looking for more user friendly
> options.

One workaround could be to login to the server, start a screen session
and psql in the screen session. Then if your network connection drops
you can simply login again and resume the screen session. Of course this
only works if you have a shell login on the server which may not be the
case.
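
Roughly like this (the host name is made up; it assumes a shell account on the
database server):

    ssh db-server
    screen -S psql        # start a named screen session
    psql mydb
    # ... the connection to the office drops ...
    ssh db-server
    screen -r psql        # reattach; the psql session is still there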

hp

-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


Re: [GENERAL] inevitability of to_date() when converting representations which don't represent whole timestamps

2017-03-30 Thread Peter J. Holzer
I don't understand what you mean by "inevitability" in the subject.

On 2017-03-29 21:19:56 -0400, Shaun Cutts wrote:
> When being asked to convert a day of the week, the to_date() function
> returns the same day ('0001-01-01 BC’) no matter which day is
> converted:
> 
> # select to_date(‘Monday’, ‘Day’)
> '0001-01-01 BC’
> 
> # select to_date(‘Tuesday’, ‘Day’)
> '0001-01-01 BC’
> 
> However, if it were to return a date that was that day of the week, it
> could be inverted:
> 
> # select extract(dow from '0001-01-01 BC'::date); — this date should be the 
> result of to_date(‘Sunday’, ‘Day’)
> 6
> 
> # select extract(dow from '0001-01-02 BC'::date); — this date should be the 
> result of to_date(‘Monday’, ‘Day’)
> 0
> 
> ….
> 
> David tells this is not a bug, but it still seems like a reasonable
> requirement on to_date() to me. Is there some reason why this isn’t
> possible?

The documentation warns that to_date “interpret[s] input liberally, with
minimal error checking” and while it “produce[s] valid output, the
conversion can yield unexpected results.”

I would agree that producing the same date for every day of the week
crosses the line between “unexpected (but valid) result” and “bug”.

On the other hand I have no idea what the result of to_date(‘Monday’,
‘Day’) should be. “Any date which is a Monday” seems too vague. “The
nearest Monday”, “the previous Monday”, “the next Monday” might be
useful in practice, but whichever of them you pick, you've picked the
wrong one with a probability of 2/3. “The first monday in the year -1 of
the proleptic Gregorian calendar” would be consistent with how
to_timestamp('12:34:56', 'HH24:MI:SS') works, but apart from that and
being invertible it seems to be a quite useless choice.

hp

-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


Re: [GENERAL] Postgres Permissions Article

2017-03-30 Thread Peter J. Holzer
On 2017-03-29 08:05:23 -0700, Paul Jungwirth wrote:
> On 03/29/2017 06:36 AM, Tom Lane wrote:
> >Karsten Hilbert <karsten.hilb...@gmx.net> writes:
> >>Being able to create foreign keys may allow to indirectly
> >>discover whether certain values exists in a table which I
> >>don't otherwise have access to (by means of failure or
> >>success to create a judiciously crafted FK).
> >
> >Aside from that, an FK can easily be used to cause effective
> >denial-of-service, for example preventing rows from being deleted
> >within a table, or adding enormous overhead to such a deletion.
> 
> Thank you both for taking a look! I agree those are both worthwhile
> concerns. It still seems a little strange it is not just part of the CREATE
> permission (for example). I understand why not everyone can create a foreign
> key, I just have trouble imagining a use case where it is helpful to
> separate it from other DDL commands.

A foreign key affects not only the table on which it is defined but also
the table it references. 

If Alice creates a table “master” and Bob creates a table “detail”
referencing “master”, Bob can prevent Alice from deleting entries from
her own table. So Alice must be able to decide whom she allows to
reference her tables.
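
In SQL terms, that decision is the REFERENCES privilege (a sketch, using the
table names from the example above):

    -- as alice:
    create table master (id int primary key);
    grant references on master to bob;

    -- as bob (fails without the grant above):
    create table detail (
        id        int primary key,
        master_id int references master (id)
    );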

I don't see how how this could be part of the create privilege - I
certainly want different roles to be able to create their own tables (or
views, or whatever) without being able to DOS each other (accidentally
or intentionally).

(Also I don't understand why you wrote “You need the permission on both
tables”: Only the owner of a table can add constraints to it - this
privilege cannot be granted to other roles at all. So to create a
foreign key constraint you need to be the owner of the referencing table
and have the references privilege on the referenced table. It's not
symmetrical.)

hp

-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


Re: [GENERAL] Autoanalyze oddity

2017-03-23 Thread Peter J. Holzer
On 2017-03-05 12:01:07 +0100, Peter J. Holzer wrote:
[...]
> At the current rate of inserts, this threshold will be reached on
> March 24th. I'll check whether the table is analyzed then.

It was (a little earlier than expected because pg_class.reltuples didn't
increase in the meantime).

hp


-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


Re: [GENERAL] Autoanalyze oddity

2017-03-05 Thread Peter J. Holzer
On 2017-03-05 08:39:05 -0800, Adrian Klaver wrote:
> On 03/05/2017 03:01 AM, Peter J. Holzer wrote:
> >So it is likely that something happened on that day (disk full?) which
> >wiped out the contents of pg_stat_user_tables.
> 
> Are there any logs from that time, either Postgres or system?
> 
> I would think a full disk would have been noticed at the time so alternate
> theories:
> 
> https://www.postgresql.org/docs/9.5/static/monitoring-stats.html
> 
> "...  When the server shuts down cleanly, a permanent copy of the statistics
> data is stored in the pg_stat subdirectory, so that statistics can be
> retained across server restarts. When recovery is performed at server start
> (e.g. after immediate shutdown, server crash, and point-in-time recovery),
> all statistics counters are reset.
> ..."

Oh, of course. That was the day we found out the hard way that the
bypass for the UPS didn't work. I knew that date looked familiar, but
somehow couldn't place it. Mystery solved, thanks!

hp


-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


Re: [GENERAL] Autoanalyze oddity

2017-03-05 Thread Peter J. Holzer
On 2017-03-03 06:39:35 -0800, Adrian Klaver wrote:
> On 03/03/2017 12:33 AM, Peter J. Holzer wrote:
> >This is with PostgreSQL 9.5.6 on Debian Linux.
> >
> >I noticed that according to pg_stat_user_tables autoanalyze has never
> >run on a lot of tables. Here is one example:
> >
> >wdsah=> select * from pg_stat_user_tables where schemaname='public' and 
> >relname='facttable_wds_indexstats';
> >─[ RECORD 1 ]───┬─
[...]
> >n_tup_ins   │ 47128
[...]
> >n_live_tup  │ 47128
> >n_dead_tup  │ 0
> >n_mod_since_analyze │ 47128
> >last_vacuum │ (∅)
> >last_autovacuum │ (∅)
> >last_analyze│ (∅)
> >last_autoanalyze│ (∅)
> >vacuum_count│ 0
> >autovacuum_count│ 0
> >analyze_count   │ 0
> >autoanalyze_count   │ 0
> >
> >wdsah=> select count(*) from facttable_wds_indexstats;
> > count
> >
> > 857992
> >(1 row)
> >
> >So, n_live_tup is way off, and n_tup_ins and n_mod_since_analyze also
> >seem to be wrong. Looks like this hasn't been updated in a year or so.
> >But track_counts is on:
> >
> >wdsah=> show track_counts;
> > track_counts
> >──
> > on
> >(1 row)
> 
> What are your settings for autovacuum?:
> 
> https://www.postgresql.org/docs/9.5/static/runtime-config-autovacuum.html

All the values in the autovacuum section of postgresql.conf are
commented out, so they should be the default values:

Just to be sure here's the output of show for each of the parameters:

wdsah=> show autovacuum;  on
wdsah=> show log_autovacuum_min_duration; -1
wdsah=> show autovacuum_max_workers;  3
wdsah=> show autovacuum_naptime;  1min
wdsah=> show autovacuum_vacuum_threshold; 50
wdsah=> show autovacuum_analyze_threshold;50
wdsah=> show autovacuum_vacuum_scale_factor;  0.2
wdsah=> show autovacuum_analyze_scale_factor; 0.1
wdsah=> show autovacuum_freeze_max_age;   2
wdsah=> show autovacuum_multixact_freeze_max_age; 4
wdsah=> show autovacuum_vacuum_cost_delay;20ms
wdsah=> show autovacuum_vacuum_cost_limit;-1


> Have the storage parameters for the table been altered?:
> 
> https://www.postgresql.org/docs/9.5/static/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS

No.

> >And even if it wasn't, shouldn't the autovacuum daemon notice that
> >n_mod_since_analyze is greater than n_live_tup *
> >autovacuum_analyze_scale_factor and run an autoanalyze?
> 
> That value is added to autovacuum_analyze_threshold:
> 
> autovacuum_analyze_scale_factor (floating point)
> 
> Specifies a fraction of the table size to add to
> autovacuum_analyze_threshold when deciding whether to trigger an ANALYZE.
> The default is 0.1 (10% of table size). This parameter can only be set in
> the postgresql.conf file or on the server command line; but the setting can
> be overridden for individual tables by changing table storage parameters.

True. But 50 is negligible compared to 47128*0.1. So that shouldn't make
much of a difference.

But now that I look closer, I notice that the number in n_tup_ins for
that table is exactly the number of records inserted since
2017-02-08T13:00 and there were no records inserted between 09:00 and
13:00 on that day.

So it is likely that something happened on that day (disk full?) which
wiped out the contents of pg_stat_user_tables.

Looking into the source code, I find that 
reltuples = classForm->reltuples;
Am I correct to assume that this is pg_class.reltuples? That would
explain why analyze hasn't run yet: This is 862378, which is exactly
correct. 862378 * 0.1 + 50 is 86287.8, which is larger than
pg_stat_user_tables.n_mod_since_analyze. At the current rate of inserts,
this threshold will be reached on March 24th. I'll check whether the
table is analyzed then.
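
For reference, that check can be written as a query (a sketch, with the two
thresholds hard-coded to the defaults shown above):

    select c.relname,
           c.reltuples,
           s.n_mod_since_analyze,
           50 + 0.1 * c.reltuples as analyze_threshold
    from pg_class c
    join pg_stat_user_tables s on s.relid = c.oid
    where c.relname = 'facttable_wds_indexstats';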

hp


-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


[GENERAL] Autoanalyze oddity

2017-03-03 Thread Peter J. Holzer
This is with PostgreSQL 9.5.6 on Debian Linux.

I noticed that according to pg_stat_user_tables autoanalyze has never
run on a lot of tables. Here is one example:

wdsah=> select * from pg_stat_user_tables where schemaname='public' and 
relname='facttable_wds_indexstats';
─[ RECORD 1 ]───┬─
relid   │ 112723
schemaname  │ public
relname │ facttable_wds_indexstats
seq_scan│ 569
seq_tup_read│ 474779212
idx_scan│ 59184
idx_tup_fetch   │ 59184
n_tup_ins   │ 47128
n_tup_upd   │ 0
n_tup_del   │ 0
n_tup_hot_upd   │ 0
n_live_tup  │ 47128
n_dead_tup  │ 0
n_mod_since_analyze │ 47128
last_vacuum │ (∅)
last_autovacuum │ (∅)
last_analyze│ (∅)
last_autoanalyze│ (∅)
vacuum_count│ 0
autovacuum_count│ 0
analyze_count   │ 0
autoanalyze_count   │ 0

wdsah=> select count(*) from facttable_wds_indexstats;
 count  

 857992
(1 row)

So, n_live_tup is way off, and n_tup_ins and n_mod_since_analyze also
seem to be wrong. Looks like this hasn't been updated in a year or so.
But track_counts is on:

wdsah=> show track_counts;
 track_counts 
──
 on
(1 row)

And even if it wasn't, shouldn't the autovacuum daemon notice that
n_mod_since_analyze is greater than n_live_tup *
autovacuum_analyze_scale_factor and run an autoanalyze?

But the really weird thing is that pg_stats seems to be reasonably
current: I see entries in most_common_vals which were only inserted in
January. Is it possible that autoanalyze runs without updating
pg_stat_user_tables?

hp


-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


Re: [GENERAL] Loose indexscan and partial indexes

2017-02-10 Thread Peter J. Holzer
On 2017-02-10 14:24:36 +0100, Thomas Kellerer wrote:
> Peter J. Holzer wrote on 10.02.2017 at 14:02:
> > So it's doing a sequential scan on the initial select in the recursive
> > CTE, but using the index on the subsequent selects.
> > 
> > But why? If it uses the index on
> > SELECT MIN(periodizitaet) FROM facttable_imf_ifs WHERE periodizitaet > 
> > 'x'
> > shouldn't it be able to use the same index on 
> > SELECT MIN(periodizitaet) FROM facttable_imf_ifs
> 
> What is the definition of the index facttable_imf_ifs_periodizitaet_idx?

The solution to the puzzle was just 2 paragraphs further down. 

Looks like I have to practice this arc of suspense thing ;-)

hp

-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


[GENERAL] Loose indexscan and partial indexes

2017-02-10 Thread Peter J. Holzer
[The query and the top of the EXPLAIN ANALYZE plan are truncated in the
archive; the surviving part of the plan, with wrapped lines rejoined:]

 [...].51 rows=1 width=2) (actual time=0.039..0.040 rows=1 loops=1)
   ->  Index Only Scan using facttable_imf_ifs_periodizitaet_idx on facttable_imf_ifs  (cost=0.44..1516760.47 rows=21080284 width=2) (actual time=0.038..0.038 rows=1 loops=1)
         Index Cond: ((periodizitaet IS NOT NULL) AND (periodizitaet IS NOT NULL))
         Heap Fetches: 1
   ->  WorkTable Scan on t t_1  (cost=0.00..6.19 rows=10 width=32) (actual time=0.161..0.162 rows=1 loops=4)
         Filter: (periodizitaet IS NOT NULL)
         Rows Removed by Filter: 0
         SubPlan 3
           ->  Result  (cost=0.59..0.60 rows=1 width=0) (actual time=0.212..0.212 rows=1 loops=3)
                 InitPlan 2 (returns $3)
                   ->  Limit  (cost=0.44..0.59 rows=1 width=2) (actual time=0.211..0.211 rows=1 loops=3)
                         ->  Index Only Scan using facttable_imf_ifs_periodizitaet_idx on facttable_imf_ifs facttable_imf_ifs_1  (cost=0.44..1061729.65 rows=7026761 width=2) (actual time=0.208..0.208 rows=1 loops=3)
                               Index Cond: ((periodizitaet IS NOT NULL) AND (periodizitaet > t_1.periodizitaet))
                               Heap Fetches: 2
 Planning time: 8.883 ms
 Execution time: 0.801 ms
(23 rows)

800 times faster :-).
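
For context, the shape of the plan suggests the standard recursive-CTE
emulation of a loose index scan, roughly like the following reconstruction
(treat it as a guess, since the actual query is truncated in the archive):

    with recursive t as (
        select min(periodizitaet) as periodizitaet from facttable_imf_ifs
        union all
        select (select min(periodizitaet) from facttable_imf_ifs
                where periodizitaet > t.periodizitaet)
        from t
        where t.periodizitaet is not null
    )
    select periodizitaet from t where periodizitaet is not null;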

hp

-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


[GENERAL] Streaming replication protocol

2017-01-13 Thread Christopher J. Bottaro
Hello,

I'm trying to write a program that speaks the streaming replication
protocol (for logical decoding).  I get to the part where I issue a query:

START_REPLICATION SLOT regression_slot LOGICAL 0/0;

And after that, I receive an empty copy_both_response then a copy_data that
has a "Primary keepalive message" in it.  Then I'm stuck; I don't know how
to respond to that.

I'm going off the documentation here:

https://www.postgresql.org/docs/9.5/static/protocol-replication.html

Which I find really confusing and hard to understand.  Any help would be
appreciated!  Thank you.


Re: [GENERAL] Are new connection/security features in order, given connection pooling?

2017-01-13 Thread Peter J. Holzer
On 2017-01-11 00:49:19 -0800, Guyren Howe wrote:
> I’m not following. What I would like is just a lightweight way to switch the
> connections to use a different role, or some moral equivalent, that would
> prevent an SQL injection from wrecking havoc. I’m not proposing anything that
> will change anything else about how the application is using the database.
> 
> SET ROLE doesn’t work, because an SQL injection can just SET ROLE back to the
> privileged user.

But then you are no worse off than with the commonly used scheme of
executing all queries as the same (necessarily "privileged") user.

In both cases the attacker can execute queries as a privileged user IF
he succeeds at sql injections.

But as others have already noted this is relatively easy to prevent.
Just preparing all queries is sufficient, even if you don't actually
parametrize them. Perl DBI does this, so this dangerous-looking line of
code (assume that the string wasn't hardcoded but the result of an SQL
injection):

$r = $dbh->selectall_arrayref(
    "select * from twoqueries; insert into twoqueries(t) values('b')");

will fail with

DBD::Pg::db selectall_arrayref failed: ERROR:  cannot insert
multiple commands into a prepared statement at ./twoqueries line 21.

So I think just using 

set local role 

at the beginning of each transaction should work well with session
pooling. It doesn't protect you against sql injections, but you won't
have to reinvent the authorization system.
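
A sketch of that pattern (the role name is made up):

    begin;
    set local role webapp_limited;   -- effective only for this transaction
    -- ... run the application's queries with webapp_limited's privileges ...
    commit;                          -- the connection reverts to the pool's user

SET LOCAL reverts automatically at commit or rollback, so a pooled connection
handed to the next client starts out with its original role again.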

> I would like a regime where there is no permanent privileged relationship
> between the client application and the database; a user would need to supply
> validating information that we can use to switch a connection to something 
> with
> minimal privileges for just that role, for the duration of one session or
> transaction.

I haven't read the blog post referenced in this thread yet, so maybe
this is covered there, but I think "supplying validating information"
would be the hard part. In general you wouldn't want a web-frontend to
cache plain-text passwords to resubmit them for each transaction, but to
use something more ethereal, like session cookies or kerberos tickets.

hp

-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Peter J. Holzer
On 2017-01-04 06:53:31 -0800, Adrian Klaver wrote:
> On 01/04/2017 05:00 AM, vod vos wrote:
> >Now I am confused about I can create 1100 columns in a table in
> >postgresql, but I can't copy 1100 values into the table. And I really
> 
> As pointed out previously:
> 
> https://www.postgresql.org/about/
> Maximum Columns per Table 250 - 1600 depending on column types
> 
> That being dependent on both the number of columns and the actual data in
> the columns.

I think this is confusingly phrased. In my mind "column type" is static
- the type is the same, independent of the values which are stored. So
  "250 - 1600 depending on column types" implies to me that there is
some type A of which I can have only 250 columns and another type B of
which I can have 1600 columns. But it doesn't imply to me that the
number of columns depends on the values which are put into those columns.

May I suggest these improvements?

In https://www.postgresql.org/about/:
Instead of
| 250 - 1600 depending on column types
write
| 250 - 1600 depending on column types and data

In https://www.postgresql.org/docs/9.6/static/ddl-basics.html:
Replace the sentence:
| Depending on the column types, it is between 250 and 1600.
with:
| For all columns in a row, some information (either the data itself or
| a pointer to the data) must be stored in a single block (8 kB).
| Because for some types this data is itself of variable length, the
| maximum number of columns depends not only on the types of the columns
| but also on the data (e.g., a NULL uses less space than a non-NULL
| value). Therefore there is no simple way to compute the maximum number
| of columns, and it is possible to declare a table with more columns
| than can be filled. Keeping all this in mind, the limit is between 250
| and 1600.

hp

-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


Re: [GENERAL] explain analyze showed improved results without changes, why?

2016-12-25 Thread Peter J. Holzer
On 2016-12-23 10:35:26 -0600, Chris Richards wrote:
> Howdy. I was noticing a significant problem with a query on one of my tables. 
> I
> tried recreating the problem and wasn't able to do so on a different install,
> and so a few days later I ran the same query on the problem table. Lo' and
> behold, there wasn't a problem anymore. I'm at a loss to why.
[...]
>     "blocks_off_sz_idx" btree (off, sz)
> 
> mdb=> explain analyze SELECT * FROM blocks
>  WHERE cloudidx=98038 AND off+sz >= 0 AND state='seeded'
>  ORDER BY off LIMIT 1 FOR UPDATE;
>                                                                       QUERY
> PLAN
> ---
>  Limit  (cost=0.43..587.83 rows=1 width=100) (actual time=4814.579..4814.579 
> rows=1 loops=1)
>    ->  LockRows  (cost=0.43..1358633.99 rows=2313 width=100) (actual time= 
> 4814.577..4814.577 rows=1 loops=1)
>          ->  Index Scan using blocks_off_sz_idx on blocks  (cost= 
> 0.43..1358610.86 rows=2313 width=100) (actual time=4813.498..4814.384 rows=2 
> loops=1)
>                Filter: ((cloudidx = 98038) AND (state = 
> 'seeded'::block_state) AND ((off + sz) >= 0))
>                Rows Removed by Filter: 6935023
>  Total runtime: 4814.619 ms
> (6 rows)

This scans the table in ascending (off, sz) order until it finds one row
matching the filter. Apparently at the time of the query there were
6935023 rows in the table before the matching row.

[...]
> And here's the second. Notice that even though there are more rows, it was 
> much
> faster and the "rows removed by filter" were significantly reduced by several
> orders of magnitude.
> 
> 
> mdb=> explain analyze SELECT * FROM blocks
>  WHERE cloudidx=98038 AND off+sz >= 0 AND state='seeded'
>  ORDER BY off LIMIT 1 FOR UPDATE;
>                                                                    QUERY PLAN
> -
>  Limit  (cost=0.43..584.57 rows=1 width=100) (actual time=0.071..0.071 rows=1 
> loops=1)
>    ->  LockRows  (cost=0.43..1390825.21 rows=2381 width=100) (actual 
> time=0.070..0.070 rows=1 loops=1)
>          ->  Index Scan using blocks_off_sz_idx on blocks  
> (cost=0.43..1390801.40 rows=2381 width=100) (actual time=0.055..0.055 rows=1 
> loops=1)
>                Filter: ((cloudidx = 98038) AND (state = 
> 'seeded'::block_state) AND ((off + sz) >= 0))
>                Rows Removed by Filter: 26
>  Total runtime: 0.114 ms
> (6 rows)

The plan here is exactly the same, but only 26 rows are discarded. My
guess is that between those two queries a row was inserted with a really
low (off, sz) value which matches the query. So now the query can return
after checking only a handful of rows.

LIMIT, EXISTS, etc. are awful when you want predictable performance. You
may be lucky and the rows you are looking for are just at the start or
you may be unlucky and you have to scan through the whole table to find
them. The optimizer (usually) doesn't have enough information and
assumes they are spread randomly through the table.

hp

-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


Re: [GENERAL] Is is safe to use SPI in multiple threads?

2016-12-23 Thread Peter J. Holzer
On 2016-12-09 16:52:05 +0800, Qiu Xiafei wrote:
> I'm new to PG and want to implement my domain-specific system based on PG. I
> wish to arrange my data as several tables in a database and translate my DSL
> into SQL statements for querying. Since one DSL statement may be mapped to
> several SQL statements, it's better to push the DSL server as close to the PG
> server as possible. I found that PG's background worker meets my needs. I can
> set up a background worker bound to the PG server and listen on a port for
> network requests.
> 
> But I encounter a problem: the Server Programming Interfaces are not THREAD
> SAFE. There are some global variables defined, like SPI_processed,
> SPI_tuptable, etc. This limits my DSL server to working in single-threaded
> mode, which is quite inefficient.

I had a similar requirement. I solved it by moving the application logic
out of the stored procedures. All the stored procedure does is an RPC
call (I use ØMQ for that) to a server process and send the result back
to the client. The server process converts the request into multiple SQL
queries which can be processed in parallel.

The downside is of course that the communication overhead is much
higher (A minimum of 4 network messages per request). That's not a
problem in my case, but your mileage may vary.

The advantages in my opinion are:

* A standalone server process is easier to test and debug than a bunch
  of stored procedures.
* I can easily scale out if necessary: Currently my database and server
  process run on the same machine, but I could distribute them over
  several machines with (almost) no change in logic.

hp

-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


Re: [GENERAL] Index size

2016-12-10 Thread Peter J. Holzer
On 2016-12-09 21:45:35 -0500, Melvin Davidson wrote:
> On Fri, Dec 9, 2016 at 6:40 PM, Samuel Williams 
> <space.ship.travel...@gmail.com>
> wrote:
> >I also read that when you change a column which is not index, all the
> >indexes for that row need to be updated anyway. Is that correct?
> 
> That is not correct. Indexes are changed under the following conditions:
> A. An insert is done to the table which involves an index.
> B. A delete is done to the table which involves an index.
> C. An update is done that involves columns included in an index.
> D. An index is REINDEXed
> 
> Indexes point to the tid of the row for which the column(s) in the index
> are involved. So if columns updated are not involved in the index,
> there is no need to change the index.

I don't think this is generally correct. The TID is a (block,item)
tuple. If the updated version of the row doesn't fit into the same block
it has to be stored in a different block, so the TID will change (AIUI
there is a bit of trickery to avoid changing the TID if the new version
is stored in the same block). This means that all the index entries for
this row (not just for the changed field) will have to be updated. You
can set fillfactor to a smaller value to make this less likely.
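
For example (the table name is made up):

    alter table mytable set (fillfactor = 70);
    -- only affects newly written pages; a rewrite repacks the existing ones
    vacuum full mytable;

Leaving about 30% of each block free makes it more likely that an updated row
version fits into the same block, so the indexes don't have to be touched.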

hp

-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


Re: [GENERAL] pg_dump fundenental question

2016-07-05 Thread J. Cassidy



Thank you all for the information.

On 07/05/2016 10:10 AM, J. Cassidy wrote:
> Hello Adrian,
>
> appreciate the prompt reply, thank you.
>
> As stated in the original email, I want to know whether compression
> (whatever level) is on by default (or not) - if I supply NO extra
> switches/options.

There is no compression by default.


--
Command
Prompt, Inc. http://the.postgres.company/
+1-503-667-4564
PostgreSQL
Centered full stack support, consulting and development.
Everyone appreciates
your honesty, until you are honest with them.





Re: [GENERAL] pg_dump fundamental question

2016-07-05 Thread J. Cassidy


Francisco,

appreciate the tips/hints.

My input (source) DB is 1TB in size; using the options as stated in my
original email (i.e. no compression, it would seem) the output file size is
"only" 324GB. I presume all of the formatting/indices have been omitted. As I
said before, I can browse the backup file with less/head/cat/tail etc.

Regards,

Henk



On Tue, Jul 5, 2016 at 4:54 PM, J. Cassidy <s...@jdcassidy.eu> wrote:
> I have hopefully an "easy" question.
> If I issue the pg_dump command with no switches or options i.e.
> /usr/local/pgsql/bin/pg_dump -v dbabcd > /data3TB/Backup/dbabcd
> Is there any "default" compression involved or not? Does pg_dump talk to
> zlib during the dump process given that I have not specified any compression
> on the command line? (see above).

IIRC no options means you are requesting an SQL script. Those are not
compressed, just pipe them through your favorite compressor. (In a later
message you stated you were on Linux and had a 324GB file, and could
head/tail it - have you done so?)

> Your considered opinions would be much appreciated.

OTOH, with backups that big I would recommend using the custom format (-Fc);
it's much more flexible (and you can still get the SQL script by asking
pg_restore to generate it if you need it, but not the other way round).


Francisco Olarte.




 


Re: [GENERAL] pg_dump fundamental question

2016-07-05 Thread J. Cassidy


Rob,


appreciate the reply, but I have never used, and never will use, "that" os
(small capitals intentional).


Regards,


Henk



Re: [GENERAL] pg_dump fundamental question

2016-07-05 Thread J. Cassidy


Hello David,

a good point, but I would prefer NOT to open a 324GB
backup file in a text editor. I can however cat/less/head/tail the file in
Linux.

Regards,


Henk

 


Re: [GENERAL] pg_dump fundamental question

2016-07-05 Thread J. Cassidy


Hello Adrian,

appreciate the prompt reply, thank you.

As stated in the original email, I want to know whether compression (whatever
level) is on by default (or not) - if I supply NO extra switches/options.
I have read the documentation and it is unclear in this respect. I am a
Mainframer and perhaps have a different world view on how to explain things...

TIA and regards,


Henk.



On 07/05/2016 07:54 AM, J. Cassidy wrote:
> Hello all,
>
> I have hopefully an "easy" question.
>
> If I issue the pg_dump command with no switches or options i.e.
>
> /usr/local/pgsql/bin/pg_dump -v dbabcd > /data3TB/Backup/dbabcd
>
> Is there any "default" compression involved or not? Does pg_dump talk to
> zlib during the dump process given that I have not specified any
> compression on the command line? (see above).
>
> Your considered opinions would be much appreciated.

https://www.postgresql.org/docs/9.5/static/app-pgdump.html

"
-F format
--format=format

Selects the format of the output.
format can be one of the following:

p
plain

Output a plain-text SQL script file (the default).
"

..

In line tag:

"-Z 0..9
--compress=0..9

Specify the compression level to use. Zero means no compression. For the
custom archive format, this specifies compression of individual table-data
segments, and the default is to compress at a moderate level. <* SEE HERE
For plain text output, setting a nonzero compression level causes the entire
output file to be compressed, as though it had been fed through gzip; but the
default is not to compress. SEE HERE *> The tar archive format currently
does not support compression at all.
"


>
>
> Regards,
>
>
> Henk
>


--
Adrian Klaver
adrian.kla...@aklaver.com





[GENERAL] pg_dump fundamental question

2016-07-05 Thread J. Cassidy


Hello all,

I have hopefully an "easy" question.

If I issue the pg_dump command with no switches or options i.e.

/usr/local/pgsql/bin/pg_dump -v dbabcd > /data3TB/Backup/dbabcd

Is there any "default" compression involved or not? Does pg_dump talk to
zlib during the dump process given that I have not specified any compression
on the command line? (see above).

Your considered opinions would be much appreciated.


Regards,


Henk
 


Re: [GENERAL] Ascii Elephant for text based protocols

2016-05-15 Thread Peter J. Holzer
On 2016-05-15 14:02:56 +0200, Charles Clavadetscher wrote:
> ++
> |     __  ___|
> |  /)/  \/   \   |
> | ( / ___\)  |
> |  \(/ o)  ( o)   )  |
> |   \_  (_  )   \ )  /   |
> | \  /\_/\)_/|
> |  \/  //|  |\\  |
> |  v |  | v  |
> |\__/|
> ||
> |  PostgreSQL 1996-2016  |
> |  20 Years of success   |
> +----+

Nice.

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] Using both ident and password in pg_hba.conf

2016-05-14 Thread Peter J. Holzer
On 2016-05-09 16:18:39 -0400, D'Arcy J.M. Cain wrote:
> On Mon, 9 May 2016 13:02:53 -0700
> Adrian Klaver <adrian.kla...@aklaver.com> wrote:
> > So define PHP runs as 'nobody'?
> 
> Because of the way PHP and Apache works PHP script have to run as the
> Apache user which, in my case anyway, is "nobody" so every PHP script
> runs as nobody.

This is not really true. You can use FastCGI to run PHP for each site as
a different user. For Apache there is also an MPM
(http://mpm-itk.sesse.net/) which lets you run apache processes (and
therefore also any embedded mod_php) under different uids. So while
running everything as nobody is the default, it is possible to use
different users, and I would strongly recommend doing this if you have
multiple customers.

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] Allow disabling folding of unquoted identifiers to lowercase

2016-05-08 Thread Peter J. Holzer
On 2016-04-29 19:21:30 +0200, Evgeny Morozov wrote:
> It would be great if Postgres had a server setting that allowed the automatic
> folding of identifiers to lowercase to be disabled, so that camel case
> identifiers could be used without having to quote every single identifier, 
> i.e.
> 
> SELECT MyColumn FROM MyTable ORDER BY MyColumn
> 
> instead of
> 
> SELECT "MyColumn" FROM "MyTable" ORDER BY "MyColumn"
[...]
> My company is looking into doing this. Currently our table and column names
> exactly match our class and property names, which are in camel case. MSSQL
> supports this just fine. To move to Postgres we would have to either quote
> *everything* or translate names back-and-forth between code and database. Both
> options are OK for auto-generated SQL, but we also have many users writing
> ad-hoc SQL queries. Having to quote everything would have those users 
> screaming
> to switch back to MSSQL very quickly! That leaves us with the mapping 
> approach,
> which is doable, but also a constant "mental speedbump" at best.

What exactly is the problem you are trying to solve? 

If you and your users are consistent about never using quotes, your
users can write:

SELECT MyColumn FROM MyTable ORDER BY MyColumn;

It will select mycolumn from mytable, but that doesn't matter, since you
created the table with

CREATE TABLE MyTable (MyColumn varchar);

so you really have a table mytable with a column mycolumn, not a table
MyTable with a column MyColumn.
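
You can see the folding in a quick test:

    CREATE TABLE MyTable (MyColumn varchar);
    SELECT MYCOLUMN FROM mytable;       -- works: every unquoted spelling folds to lowercase
    SELECT "MyColumn" FROM "MyTable";   -- ERROR:  relation "MyTable" does not exist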

There are three potential problems I can see:

1) Users might be confused that PgAdmin (or whatever tool they use to
   inspect the database) displays all the names in lowercase, and they
   might find a name like sometablewithareallylongname less readable
   than SomeTableWithAReallyLongName.

2) Since case doesn't matter, they might be inconsistent: One programmer
   might write MyTable, another mytable, the third MYTABLE, ...

3) You might want to use a tool to automatically generate SQL queries,
   but that tool quotes identifiers.

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] Is it possible to recover the schema from the raw files?

2016-04-24 Thread Tomas J Stehlik
Hello Adrian,

> Then I am of no further use to this conversation.

No problem at all.

Thank you for your well considered input and ideas.

Have a lovely day.

Kindest regards,

Tomas





Re: [GENERAL] Distributed Table Partitioning

2016-03-13 Thread Peter J. Holzer
On 2016-03-12 21:19:11 -0500, Melvin Davidson wrote:
> - Original Message -
> From: "Leonardo M. Ramé" <l.r...@griensu.com>
> To: "PostgreSql-general" <pgsql-general@postgresql.org>
> Sent: Saturday, 12 March, 2016 8:25:01 PM
> Subject: [GENERAL] Distributed Table Partitioning
> 
> I have this problem: a Master table containing records with a timestamp
> column registering creation date-time, and one Detail table containing
> info related to the Master table.
> 
> As time went by, those tables grew enormously, and I can't afford
> expanding my SSD VPS. So I'm thinking about storing only NEW data into
> it, and move OLD data to a cheaper SATA VPS.
[...]
> Why don't you just make use of tablespaces and partition the child tablespaces
> so that the newer parttion is on the SSD and the older one is on SATA?

Since he mentioned virtual private servers (VPS) the reason might be
that his hoster offers VPS with SSDs (of various sizes) and VPS with
rotating hard disks (of various sizes), but not VPS with both. So he
can't rent a VPS with a relatively small SSD and a larger hard disk. 

That might be a reason to look for an alternate hoster, but if he's
otherwise happy, switching to an unknown provider might be considered
too large a risk.
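
(For completeness, the tablespace arrangement suggested above would look
roughly like this - tablespace name, path and table names are all invented:

    CREATE TABLESPACE sata LOCATION '/srv/sata/pg_tblspc';
    ALTER TABLE detail_2015 SET TABLESPACE sata;               -- move an old partition
    ALTER INDEX detail_2015_created_idx SET TABLESPACE sata;   -- indexes move separately

but that of course presupposes that both kinds of storage are visible to the
same server.)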

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] Unable to match same value in field.

2016-03-12 Thread Peter J. Holzer
On 2016-03-10 11:09:00 +0200, Condor wrote:
> I using postgresql 9.5.1 and I have problem to match value in one field.
> Both tables are text:
[...]
> =# select imsi from list_cards_tbl where imsi = '28411123315';
>  imsi
> --
> (0 rows)
> 
> No value, lets change to LIKE
> 
> =# select imsi, md5(imsi), bit_length(imsi) from list_cards_tbl where imsi
> like '28411123315%';
>   imsi   |   md5| bit_length
> -+--+
>  28411123315 | b438e984c97483bb942eaaed5c0147f3 |120
> (1 row)

That looks familiar. I think I've seen something similar recently. That
was on 9.5beta1 (I haven't gotten around to upgrade to 9.5.1 yet).

> =# reindex table list_cards_tbl;
> REINDEX
[...]
> Still cant find value.

Dropping and recreating the index helped in my case. Still, I find it
worrying if a value which obviously is in the table can't be found using
the index.
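
In case it is useful: on 9.2 and later both steps can be done without
blocking writers (the index name here is a guess):

    DROP INDEX CONCURRENTLY list_cards_tbl_imsi_idx;
    CREATE INDEX CONCURRENTLY list_cards_tbl_imsi_idx ON list_cards_tbl (imsi);

(Neither statement can run inside a transaction block.)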

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] index problems (again)

2016-03-12 Thread Peter J. Holzer
On 2016-03-12 21:00:04 +, Geoff Winkless wrote:
> On 12 March 2016 at 18:43, Peter J. Holzer <hjp-pg...@hjp.at> wrote:
> > The question is what can be done to improve the situation.
> >
> > Tom thinks that correlation statistics would help. That seems plausible
> > to me.
[...]
> > You claim that no statistics are needed.
> 
> Well that's a bit confrontational.

Sorry. Didn't want to sound confrontational. I was just repeating points
made by Tom and you previously in this thread to establish a baseline.

> > That may or may not be true: You haven't proposed an alternate method
> > yet.
> 
> You could make an assumption that perfect distribution isn't true:
> that actually the distribution is within a certain _deviation_ of that
> perfect distribution. It wouldn't have to have been very much to make
> the index-only scan win here and would still keep the planner from
> choosing less optimal queries most of the time (and where it did end
> up making the "wrong" choice it's not going to be far off anyway).
> 
> But I'm making assumptions here, I'm aware of that. Chances are that
> actually most people's data _does_ fit into this perfect distribution
> set. Is there any research that shows that real-world data usually
> does?

I don't think most people's data is perfectly distributed. But as you
say most data is probably within some deviation of being perfectly
distributed and as long as that deviation isn't too big it doesn't
matter.

But there are certainly some common examples of highly correlated
columns. Having a serial id and a date as in your case is probably quite
common. Another example might be a surrogate primary key which is
computed from some other fields (e.g. a timeseries code starting with a
country code, or a social security number starting with the birth date,
...). That's probably not that uncommon either. 

So, I agree with you. This is a problem and it should be fixed. I'm just
sceptical that it can be done with a simple cost adjustment.


> As Jeff points out I'd have a much larger win in this instance by
> someone spending the time implementing skip index scans rather than
> messing with the planner :)

Yeah. I think I have some code which could benefit from this, too. I'll
have to try that trick from the wiki.
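
For the archives, the wiki trick is a recursive CTE that emulates a skip
scan by fetching one distinct value per index probe; roughly (the table
name is invented, the column is the one from this thread):

    WITH RECURSIVE t AS (
        (SELECT scdate FROM sctable ORDER BY scdate LIMIT 1)
        UNION ALL
        SELECT (SELECT s.scdate FROM sctable s
                WHERE s.scdate > t.scdate
                ORDER BY s.scdate LIMIT 1)
        FROM t
        WHERE t.scdate IS NOT NULL
    )
    SELECT scdate FROM t WHERE scdate IS NOT NULL;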

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] index problems (again)

2016-03-12 Thread Peter J. Holzer
On 2016-03-08 10:16:57 +, Geoff Winkless wrote:
> On 7 March 2016 at 20:40, Peter J. Holzer <hjp-pg...@hjp.at> wrote:
> > As Tom wrote, the estimate of having to read only about 140 rows is only
> > valid if sc_id and sc_date are uncorrelated. In reality your query has
> > to read a lot more than 140 rows, so it is much slower.
> 
> But as I've said previously, even if I do select from scdate values
> that I know to be in the first 1% of the data (supposedly the perfect
> condition) the scan method is insignificantly quicker than the index
> (scdate,scid) method.

Actually the planner expects to find a match within the first 0.0035 %, so
to find out how fast that would be you would have to use a value from
that range.

> Even with the absolute perfect storm (loading in the entire index for
> the full range) it's still not too bad (1.3 seconds or so).
> 
> The point is that to assume, knowing nothing about the data, that the
> data is in an even distribution is only a valid strategy if the worst
> case (when that assumption turns out to be wildly incorrect) is not
> catastrophic. That's not the case here.

True. The fundamental problem here is that the planner doesn't have any
notion of a worst case. It only knows "cost", and that is a single
number for each operation. For many operations, both the best case and
the worst case are unusable as cost - the first would almost always
underestimate the time and choose a plan which is far from optimal and
the second would almost always overestimate it and reject an optimal
plan. The art of programming a planner (which I've dabbled with in a
previous (not postgresql-related) project but certainly can't claim any
expertise in) lies in choosing a cost function which is quite close most
of the time and catastrophically wrong only very rarely. It is clear
that PostgreSQL hasn't succeed in the latter category: Correlated
columns do occur and the current cost function, which assumes that all
columns are uncorrelated can catastrophically underestimate the cost in
this case. 

The question is what can be done to improve the situation.

Tom thinks that correlation statistics would help. That seems plausible
to me.

You claim that no statistics are needed.

That may or may not be true: You haven't proposed an alternate method
yet.

I feel fairly certain that using the worst case (the cost for scanning
the whole table) would be just as bad in and would cause inferior plans
to be used in many instances.

Maybe computing the cost as weighted average of the best, average and
worst case (e.g. cost = cost_best*0.05 + cost_avg*0.90 + cost_worst*0.05)
would penalize methods with a large spread between best and worst case
enough - but that still leaves the problem of determining the weights
and determining what the "average" is. So it's the same black magic as
now, just a little more complicated (on the plus side, this would
probably be a relatively simple patch).

If we assume that we could revamp the planner completely, other
possibilities come to mind:

For example, since I think that the core problem is having a single
number for the cost, the planner could instead compute a distribution
(in the most simple case just best and worst case, but ideally many
values). Then the planner could say something like: I have two plans A
and B, and A is at most 20 % faster in almost all cases. But in the worst
case, A is 1000 times slower. Being 20 % faster most of the time is nice
but doesn't outweigh the risk of being 1000 times slower sometimes, so
I'll use B anyway. 

Another possibility I've been considering for some time is feeding back
the real execution times into the planner, but that sounds like a major
research project. (Actually I think Oracle does something like this
since version 12)

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] index problems (again)

2016-03-07 Thread Peter J. Holzer
On 2016-03-07 16:37:37 +, Geoff Winkless wrote:
> On 7 March 2016 at 16:02, Tom Lane <t...@sss.pgh.pa.us> wrote:
> > In English, what that plan is trying to do is scan the index
> > in sc_id order until it hits a row with scdate in the target range.
> > The first such row, by definition, has the correct min(sc_id) value.
> > The problem is that we're guessing at how soon we'll hit such a row.
> > If the columns are independent, then the planner can guess based on how
> > many rows in the whole table have scdate in the target range, and it
> > will probably be about right.  But that estimate can fall down very
> > badly if sc_id and scdate increase together, because then the target
> > rows aren't randomly distributed in the index sequence but could all be
> > all the way at the far end of the index.
> 
> I'm sorry, I'm obviously not being clear. I already accepted this
> argument when Victor gave it, although I believe that in part it falls
> down because sc_id is also (potentially) randomly distributed so it's
> not like you're doing a sequential table scan (it might work better on
> a clustered table, but we don't have those :) )
> 
> So you still have an extra layer of indirection into a large table
> with lots of random accesses.
> 
> > If we had cross-column correlation stats we could detect this pitfall,
> > but without that it's hard to do.
> 
> But as far as I can see, apart from the absolute extremes, the
> index-only scan is _always_ going to be quicker than the index+table
> scan.

We are talking about an "absolute extreme" here. You have about 420 date
values and you are looking for 3 of them. Assuming for the moment that
your distribution is uniform, that's 1/140th of the whole table.

So if PostgreSQL were using the (sc_date,sc_id) index, it would have to
scan 4E6/140 = 29000 index entries, extract the id value and get the
minimum of those 29000 values.

OTOH, if it uses the sc_id index, it only expects to have to scan 140
entries until it finds a matching entry. And then it is finished.

So it's 140 index entries plus row accesses against 29000 index entries.
To choose the second plan, the planner would have to estimate that
reading a random row is more than 200 times slower than reading an index
entry, which apparently it doesn't.

As Tom wrote, the estimate of having to read only about 140 rows is only
valid if sc_id and sc_date are uncorrelated. In reality your query has
to read a lot more than 140 rows, so it is much slower.


> I don't believe you need any further statistics than what is currently
> available to be able to make that judgement, and that's why I believe
> it's suboptimal.

We all know it is suboptimal, but unfortunately, without additional
statistics I don't think there is a better way. The other way around -
assuming that the columns are correlated in the worst possible way -
would remove viable plans in many cases. 

This is, I think one of the places where hints are a good idea. The
programmer sometimes knows more about the characteristics of the data
than the planner can possibly know and it is a pity that there is no way
for the programmer to pass that knowledge to the planner. (And yes, I
know that quite often the programmer is wrong - but I do believe in
giving people enough rope to hang themselves with)

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] BRIN Usage

2016-02-21 Thread Peter J. Holzer
On 2016-02-18 13:37:37 -0500, Tom Smith wrote:
> it is for reducing index size as the table become huge. 
> sorry for confusion, by timestamp, I meant a time series number, not the sql
> timestamp type.
> I need the unique on the column to ensure no duplicate,   but the btree index
> is getting
> huge so BRIN seems to solve problem but can not ensure unique

If it is getting huge, then this is because there are a large number of
timestamps. If you want an index to ensure uniqueness, it will have to
store every value. I don't think there's a way around that. 

With a BRIN index, you would only get a list of page ranges which could
possibly contain the new value. All these pages would then have to be
scanned sequentially to make sure it isn't already there. That could be
implemented, but it would make inserts very slow - I don't think you
would want that on a huge table even if postgres implemented it.
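
For what it's worth, the two kinds of index look like this (names invented);
the BRIN index stays tiny and is fine for range scans, but only the btree
can enforce uniqueness:

    CREATE INDEX readings_ts_brin ON readings USING brin (ts);
    CREATE UNIQUE INDEX readings_ts_key ON readings (ts);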

hp


signature.asc
Description: Digital signature


Re: [GENERAL] strange sql behavior

2016-02-06 Thread Peter J. Holzer
On 2016-02-01 12:35:35 -0600, Yu Nie wrote:
> Recently I am working with a large amount of taxis GIS data and had 
> encountered
> some weird performance issues.  I am hoping someone in this community can help
> me figure it out.
> 
> The taxi data were loaded in 5 minute block into a table.  I have two separate
> such tables, one stores a month of data with about 700 million rows, another
> stores about 10 days of data with about 300 million rows.  The two tables have
> the exactly same schema and indexes. There are two indexes: one on taxiid
> (text), and the other on the time stamp (date time).  In order to process the
> data, I need to get all points for a single taxis; to do that, I use something
> like:
>  select * from table1 where taxiid = 'SZB00S41' order by time;
> What puzzled me greatly is that this query runs consistently much faster for
> the large table than for the small table, which seems to contradict with
> intuition.
[...]

> Results for the small table: it took 141 seconds to finish.  The planning time
> is 85256.31
> 
> "Sort  (cost=85201.05..85256.31 rows=22101 width=55) (actual time=
> 141419.499..141420.025 rows=20288 loops=1)"
> "  Sort Key: "time""
> "  Sort Method: quicksort  Memory: 3622kB"
> "  Buffers: shared hit=92 read=19816"
> "  ->  Bitmap Heap Scan on data2013_01w  (cost=515.86..83606.27 rows=22101
> width=55) (actual time=50.762..141374.777 rows=20288 loops=1)"
> "    Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
> "    Heap Blocks: exact=19826"
> "    Buffers: shared hit=92 read=19816"
  ^^
[...]
> 
> Results for the large table: it took 5 seconds to finish.  The planning time 
> is
> 252077.10
> "Sort  (cost=251913.32..252077.10 rows=65512 width=55) (actual time=
> 5038.571..5039.765 rows=44204 loops=1)"
> "  Sort Key: "time""
> "  Sort Method: quicksort  Memory: 7753kB"
> "  Buffers: shared hit=2 read=7543"
> "  ->  Bitmap Heap Scan on data2011_01  (cost=1520.29..246672.53 rows=65512
> width=55) (actual time=36.935..5017.463 rows=44204 loops=1)"
> "    Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
> "    Heap Blocks: exact=7372"
> "    Buffers: shared hit=2 read=7543"
 ^
[...]

The obvious difference is that the query for the smaller table needs to
read about 2.5 times as many blocks (for 1/3 of the records) from the
disk. This suggests that the data for a single taxi is more localized in
the larger table. In addition, the average time per block on the smaller
table is about 7 ms, which is a typical random seek time for a disk. So
the blocks are probably randomly scattered through the table. For the
larger table, the average time is well below 1 ms, so there are probably
many consecutive blocks to read.

There are 2880 5 minute intervals in 10 days. You have about 22k records
per taxi, so there are about 7.6 records for each taxi per interval.
This is very close to the number of records per block in your second
query (65512/7372 = 8.9). I suspect that the records in your larger
table are sorted by taxiid within each interval. 

You can almost certainly get a similar speedup by sorting each 5 minute
interval by taxi id before appending it to the table.

If querying by taxiid is typical and your table is static, you should
consider clustering the table by taxiid. If your table is updated every
5 minutes, you could partition it by day and cluster each partition as
soon as it is not written any more.
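
A rough sketch of that clustering step, assuming an index on taxiid already
exists (the index name is a guess); note that CLUSTER takes an exclusive
lock and rewrites the whole table:

    CLUSTER data2013_01w USING data2013_01w_taxiid_idx;
    ANALYZE data2013_01w;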

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] Connecting to SQL Server from Windows using FDW

2016-01-23 Thread John J. Turner
> On 23 January 2016 at 04:40, John J. Turner <fenwayri...@gmail.com> wrote:
> On Jan 22, 2016, at 1:05 PM, ivo silvestre <ivo...@gmail.com> wrote:
> 
> > I need to create a linked server between 2 Windows servers. In one I've 
> > PostgreSQL with admin privileges and in the other MS SQL with only read 
> > access.
> >
> > I need to create a view (or a foreign table?) in PostgreSQL from a table in 
> > MS SQL in another server ...
> > ... I found GeoffMontee's Github, but I don't know how to install it on 
> > Windows...
> 
> Perhaps this link may help:
> https://www.mssqltips.com/sqlservertip/3663/sql-server-and-postgresql-foreign-data-wrapper-configuration--part-3/
> 
> The only caveat I see offhand is the use of the 'sa' account, but I can't 
> vouch for that being a required mapping.


On Jan 23, 2016, at 4:08 AM, ivo silvestre <ivo...@gmail.com> wrote:
> Hi John,
> 
> Thanks, but I already saw that link. The problem is to installing the tds_fwd 
> extension on postgres. And I don't know how to (never tried) compile it...
> 
> What comes with postgres by default is the postgres fdw, that allow to link 2 
> different postgres servers, but in this case that doesn't help me.


Sorry Ivo, my bad - I managed to gloss over the salient point for your issue in 
this link - you're on Windows, his Postgres was on Linux!

Compiling an extension on Windows was unfortunately not successful the last 
time I tried it, many moons ago; it involved compiling the whole pg source 
tree via MinGW.

Hazarding a thought - although it's discontinued, it might be worth checking 
out Windows Subsystem for UNIX-based Applications (SUA):
https://en.wikipedia.org/wiki/Windows_Services_for_UNIX

But I suspect anything you managed to compile with that would still be 
incompatible with your pg instance...

Beyond that, perhaps the gurus here have some sage advice...

- John

Re: [GENERAL] Connecting to SQL Server from Windows using FDW

2016-01-22 Thread John J. Turner
On Jan 22, 2016, at 1:05 PM, ivo silvestre  wrote:

> I need to create a linked server between 2 Windows servers. In one I've 
> PostgreSQL with admin privileges and in the other MS SQL with only read 
> access.
> 
> I need to create a view (or a foreign table?) in PostgreSQL from a table in 
> MS SQL in another server ...
> ... I found GeoffMontee's Github, but I don't know how to install it on 
> Windows...

Perhaps this link may help:
https://www.mssqltips.com/sqlservertip/3663/sql-server-and-postgresql-foreign-data-wrapper-configuration--part-3/

The only caveat I see offhand is the use of the 'sa' account, but I can't vouch 
for that being a required mapping.

Cheers,
John





Re: [GENERAL] plperlu stored procedure seems to freeze for a minute

2015-12-04 Thread Peter J. Holzer
On 2015-12-03 10:02:18 -0500, Tom Lane wrote:
> "Peter J. Holzer" <hjp-pg...@hjp.at> writes:
> > Can those signals be safely ignored? Just blocking them (so that they
> > are delivered after the UDF finishes) might be safer. But even that may
> > be a problem: If the UDF then executes some SQL, could that rely on
> > signals being delivered? I have no idea. 
> 
> The minute you start fooling with a backend's signal behavior, we're
> going to politely refuse to support whatever breakage you run into.

As I understood Jim he was talking about possible changes to postgresql
to shield UDFs from those signals, not something the author of a UDF
should do.


> We aren't sending those signals just for amusement's sake.

Right. That's why I was sceptical whether those signals could be
ignored. I wouldn't have thought so, but Jim clearly knows a lot more
about the inner workings of postgresql than I do (which is easy - I know
almost nothing) and maybe he knows of a way (something like "we can
ignore signals while executing the UDF and just assume that we missed at
least one signal and call the magic synchronize state function
afterwards")

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] plperlu stored procedure seems to freeze for a minute

2015-12-03 Thread Peter J. Holzer
On 2015-12-02 19:07:55 -0600, Jim Nasby wrote:
> On 12/2/15 9:26 AM, Peter J. Holzer wrote:
> >As explained in backend/utils/misc/timeout.c, the timers are never
> >cancelled: If a timeout is cancelled, postgres just sees that it has
> >nothing to do and resumes whatever it is doing.
> 
> Hrm, if those timers are really just for auth purposes then perhaps they
> should be cancelled. But aside from that, there's certainly other things
> that can signal a backend (including fairly normal things, like DDL).

Yep. I noticed that, too. In one of my test runs I got two signals
instead of the one I expected. Checking the logs I found that it seemed be
caused by another user dropping a table.

> Offhand I don't think functions run in a CRITICAL block (and I don't think
> it'd be a good idea for them to). So really, functions have to be handle
> being interrupted.

Right. I think that should be mentioned somewhere in the manual.
Something like this:

Note: PostgreSQL uses signals for various purposes. These signals
may be delivered while a user-defined function is executed.
Therefore user-defined functions must be able to handle being
interrupted, in particular they must expect system calls to fail
with errno=EINTR and handle that case appropriately.

I'm not sure wether that's an issue with all procedural languages. If it
is, it should probable go into "Chapter 39. Procedural Languages". If it
is specific to plperl(u), I would put it in "42.8.2. Limitations and
Missing Features".

> Yeah, it'd be nice to detect that this had happened. Or maybe it's worth it
> to ignore SIGALARM while a UDF is running.

Can those signals be safely ignored? Just blocking them (so that they
are delivered after the UDF finishes) might be safer. But even that may
be a problem: If the UDF then executes some SQL, could that rely on
signals being delivered? I have no idea. 

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] plperlu stored procedure seems to freeze for a minute

2015-12-02 Thread Peter J. Holzer
On 2015-12-01 20:55:02 +0100, Peter J. Holzer wrote:
> On 2015-12-01 18:58:31 +0100, Peter J. Holzer wrote:
> > I suspect such an interaction because I cannot reproduce the problem
> > outside of a stored procedure. A standalone Perl script doing the same
> > requests doesn't get a timeout.
[...]
> The strace doesn't show a reason for the SIGALRM, though. No alarm(2) or
> setitimer(2) system call (I connected strace to a running postgres
> process just after I got the prompt from "psql" and before I typed
> "select * from mb_search('export');" (I used a different (but very
> similar) stored procedure for those tests because it is much easier to
> find a search which is slow enough to trigger a timeout at least
> sometimes than a data request (which normally finishes in
> milliseconds)).
> 
> So I guess my next task will be to find out where that SIGALRM comes
> from and/or whether I can just restart the zmq_msg_recv if it happens. 

Ok, I think I know where that SIGALRM comes from: It's the
AuthenticationTimeout. What I'm seeing in strace (if I attach it early
enough) is that during authentication the postgres worker process calls
setitimer with a 60 second timeout twice. This matches the comment in
backend/postmaster/postmaster.c:

 * Note: AuthenticationTimeout is applied here while waiting for the
 * startup packet, and then again in InitPostgres for the duration of 
any
 * authentication operations.  So a hostile client could tie up the
 * process for nearly twice AuthenticationTimeout before we kick him 
off.

As explained in backend/utils/misc/timeout.c, the timers are never
cancelled: If a timeout is cancelled, postgres just sees that it has
nothing to do and resumes whatever it is doing. 

This is also what I'm seeing: 60 seconds after start, the process
receives a SIGALRM. 

If the process is idle or in a "normal" SQL statement at the time, thats
not a problem. But if it is in one of my stored procedures which is
currently calling a ØMQ function which is waiting for some I/O
(zmq_msg_recv(), most likely), that gets interrupted and returns an
error which my code doesn't know how to handle (yet). So the error gets
back to the user. 

A strange interaction between postgres and ØMQ indeed. But now that I
know what's causing it I can handle that. Thanks for your patience.
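
For anyone hitting the same thing, the setting involved can be checked
directly; 1min is the default, which matches the 60 second SIGALRMs:

    SHOW authentication_timeout;   -- 1min unless changed in postgresql.conf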

hp


-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] plperlu stored procedure seems to freeze for a minute

2015-12-01 Thread Peter J. Holzer
On 2015-12-01 13:13:27 -0500, Tom Lane wrote:
> "Peter J. Holzer" <hjp-pg...@hjp.at> writes:
> > Postgres worker processes are single-threaded, are they? Is there
> > something else which could interact badly with a moderately complex
> > multithreaded I/O library used from a stored procedure? 
> 
> Yes, lots.  If you cause additional threads to appear inside a backend
> process, things could break arbitrarily badly.  It's up to you to ensure
> that none of those extra threads ever escape to execute any non-Perl
> code.

Actually, non-ØMQ code. Perl doesn't like to be unexpectedly
multithreaded either. Yes, those threads should only ever execute code
from the ØMQ library. In fact they are automatically created and
destroyed by the library and there is no way to control them from Perl
code (there may be a way to do that from the C API, but I don't remember
seeing that in the manual).

> I suspect this could easily explain the problems you're seeing.

Quite.

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] plperlu stored procedure seems to freeze for a minute

2015-12-01 Thread Peter J. Holzer
On 2015-12-01 07:16:04 -0800, Adrian Klaver wrote:
> On 12/01/2015 06:51 AM, Peter J. Holzer wrote:
> >A rather weird observation from the log files of our server (9.5 beta1):
> >
> >2015-12-01 09:23:37 CET [26265]: [42-1] user=fragner,db=wds,pid=26265 
> >WARNING:  Use of uninitialized value $success in concatenation (.) or string 
> >at /usr/local/share/perl/5.20.2/WDS/Macrobond/Utils.pm line 36.
> >2015-12-01 09:23:37 CET [26265]: [43-1] user=fragner,db=wds,pid=26265 
> >CONTEXT:  PL/Perl function "mb_timeseriesdata_zmq"
> >[lots of other stuff from different connections]
> >2015-12-01 09:24:45 CET [26265]: [44-1] user=fragner,db=wds,pid=26265 ERROR: 
> > impossible result '' (payload=) at 
> >/usr/local/share/perl/5.20.2/WDS/Macrobond/Utils.pm line 36.
> > WDS::Macrobond::Utils::decode_result("") called at line 30
> > main::__ANON__("gen_wqehur") called at -e line 0
> > eval {...} called at -e line 0
> >
> >Two messages from the same line of the same plperlu stored procedure, 68
> >seconds apart. So what is this line 36?
> >
> > confess "impossible result '$success' (payload=$payload)";
> >
> >What? The first message clearly comes from interpolating $success
> >(which is undef at that point) into the argument. The second from
> >confess itself. What could cause a plperlu procedure to freeze for 68
> >seconds between the call to confess and its output?
> >
> >Is it possible that only the writing of the log entry is delayed?
> >
> >Another weird thing: $success is undef because a ØMQ rpc call[1] timed
> 
> And the call is?

The sequence is:

my $req_sck = zmq_socket($context, ZMQ_REQ);
zmq_connect($req_sck, $url);
my $qry_msg = join(...);
zmq_send($req_sck, $qry_msg);
my $res_msg = zmq_msg_init();
my $rv = zmq_msg_recv($res_msg, $req_sck);
# check rv here.
my $data = zmq_msg_data($res_msg); # $data is "" here
my $result = WDS::Macrobond::Utils::decode_result($data); # the error messages 
are from this function

(Yeah, ØMQ is quite low-level. There is a higher level Perl Module, but
I'm not using it). 

I omitted that because I don't think it's terribly relevant here.
Details of the usage of ØMQ are better discussed on the ØMQ mailing
list.

But there is something else which may be relevant: ØMQ uses threads
internally, and I don't actually know whether zmq_msg_recv returning
means that the read(2) call (or whatever) on the socket terminates.
It may actually continue in another thread. But I still don't see how
that could block the main thread (or wake it up again in a place which
has nothing to do with ØMQ (confess is a standard Perl function to print
a stack trace and die)). Or - just thinking aloud here - I fear I'm
abusing you guys as support teddy bears[1] - maybe it's the other way
round: confess dies, so maybe it frees some lock during cleanup which
allows the message which should have been sent by zmq_send to finally go
out on the wire. But that still doesn't explain the 68 seconds spent in
confess ...

Postgres worker processes are single-threaded, are they? Is there
something else which could interact badly with a moderately complex
multithreaded I/O library used from a stored procedure? 

I suspect such an interaction because I cannot reproduce the problem
outside of a stored procedure. A standalone Perl script doing the same
requests doesn't get a timeout.

I guess Alvaro is right: I should strace the postgres worker process
while it executes the stored procedure. The problem of course is that
it happens often enough to be annoying, but rarely enough that it's not
easily reproducible.

> >out (after 60 seconds, which is also a mystery, because ØMQ doesn't seem
> >to have a default timeout of 60 seconds, and I don't set one). But at
> 
> Network timeout?

That was my first guess, but I don't see where it would come from. Or
why it only is there if I call the code from a stored procedure, not
from a standalone script.

> >09:24:45 (i.e. the time of the error message) the answer for that RPC
> >call arrived. So it kind of looks like confess waited for the message to
> >arrive (which makes no sense at all) or maybe that confess waited for
> >something which also blocked the sending of the request (because
> >according to the server logs, the RPC request only arrived there at
> >09:24:45 and was answered within 1 second), but that doesn't make any
> 
> So if the request timed out how did you get a reply, a second request?

Nope. I don't really "get" the reply. I just see in the logs of the
other server that it sent a reply at that time. 

The time line is like this

timepostgres processmb_dal process
T   zmq_send()
zmq_msg_recv()

Re: [GENERAL] plperlu stored procedure seems to freeze for a minute

2015-12-01 Thread Peter J. Holzer
On 2015-12-01 10:20:09 -0800, Adrian Klaver wrote:
> On 12/01/2015 09:58 AM, Peter J. Holzer wrote:
> >On 2015-12-01 07:16:04 -0800, Adrian Klaver wrote:
> >>On 12/01/2015 06:51 AM, Peter J. Holzer wrote:
> >>>A rather weird observation from the log files of our server (9.5 beta1):
> >>>
> >>>2015-12-01 09:23:37 CET [26265]: [42-1] user=fragner,db=wds,pid=26265 
> >>>WARNING:  Use of uninitialized value $success in concatenation (.) or 
> >>>string at /usr/local/share/perl/5.20.2/WDS/Macrobond/Utils.pm line 36.
> >>>2015-12-01 09:23:37 CET [26265]: [43-1] user=fragner,db=wds,pid=26265 
> >>>CONTEXT:  PL/Perl function "mb_timeseriesdata_zmq"
> >>>[lots of other stuff from different connections]
> >>>2015-12-01 09:24:45 CET [26265]: [44-1] user=fragner,db=wds,pid=26265 
> >>>ERROR:  impossible result '' (payload=) at 
> >>>/usr/local/share/perl/5.20.2/WDS/Macrobond/Utils.pm line 36.
> >>> WDS::Macrobond::Utils::decode_result("") called at line 30
> >>> main::__ANON__("gen_wqehur") called at -e line 0
> >>> eval {...} called at -e line 0
> >>>
> >>>Two messages from the same line of the same plperlu stored procedure, 68
> >>>seconds apart. So what is this line 36?
> >>>
> >>> confess "impossible result '$success' (payload=$payload)";
> >>>
> >>>What? The first message clearly comes from interpolating $success
> >>>(which is undef at that point) into the argument. The second from
> >>>confess itself. What could cause a plperlu procedure to freeze for 68
> >>>seconds between the call to confess and its output?
> >>>
> >>>Is it possible that only the writing of the log entry is delayed?
> >>>
> >>>Another weird thing: $success is undef because a ØMQ rpc call[1] timed
> >>
> >>And the call is?
> >
> >The sequence is:
> >
> >my $req_sck = zmq_socket($context, ZMQ_REQ);
> >zmq_connect($req_sck, $url);
> >my $qry_msg = join(...);
> >zmq_send($req_sck, $qry_msg);
> >my $res_msg = zmq_msg_init();
> >my $rv = zmq_msg_recv($res_msg, $req_sck);
> ># check rv here.
> >my $data = zmq_msg_data($res_msg); # $data is "" here
> >my $result = WDS::Macrobond::Utils::decode_result($data); # the error 
> >messages are from this function
> >
> >(Yeah, ØMQ is quite low-level. There is a higher level Perl Module, but
> >I'm not using it).
> >
> >I omitted that because I don't think it's terribly relevant here.
> >Details of the usage of ØMQ are better discussed on the ØMQ mailing
> >list.
> >
> >But there is something else which may be relevant: ØMQ uses threads
> >internally, and I don't actually know whether zmq_msg_recv returning
> 
> Except I see this here:
> 
> http://api.zeromq.org/4-0:zmq-socket
> 
> Thread safety
> 
> ØMQ sockets are not thread safe. Applications MUST NOT use a socket from
> multiple threads except after migrating a socket from one thread to another
> with a "full fence" memory barrier.

Well yes, but I don't use a ØMQ socket in multiple threads. It is
created in the stored procedure and destroyed at the end (just checked
the strace output: Yes it is. For a moment I wasn't sure whether lexical
variables in plperlu procedures go out of scope.). It's the ØMQ library
itself which creates extra threads (And it should terminate them
properly and afaics from strace it does).


[...]
> From here:
> 
> http://api.zeromq.org/4-0:zmq-connect
> 
> It seems something like(I am not a Perl programmer, so approach carefully):
> 
> my $rc = zmq_connect($req_sck, $url);
> 
> Then you will have an error code to examine.

Yes. I have been a bit sloppy with error checking. I check only the
return value of the zmq_msg_recv() call which returns the empty message.
It is possible that the problem actually occurs earlier and I should
check those calls as well. Mea culpa.

However, in at least one case the failed call was indeed zmq_msg_recv()
not one of the earlier ones (see my next mail).

> Have you looked at the Notes at the bottom of this page:
> 
> http://www.postgresql.org/docs/9.4/interactive/plperl-trusted.html

I have. I don't think that's a problem here: Debian perl is built with
with both multiplicity and ithreads, and I would assume that the .deb
packages from postgresql.org use the shared library provided by the
system.

But even if that wasn't the case it should not be a problem as only
plperlu stored procedures are called.

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] plperlu stored procedure seems to freeze for a minute

2015-12-01 Thread Peter J. Holzer
On 2015-12-01 18:58:31 +0100, Peter J. Holzer wrote:
> I suspect such an interaction because I cannot reproduce the problem
> outside of a stored procedure. A standalone Perl script doing the same
> requests doesn't get a timeout.
> 
> I guess Alvaro is right: I should strace the postgres worker process
> while it executes the stored procedure. The problem of course is that
> it happens often enough to be annoying, but rarely enough that it's not
> easily reproducible.

I did manage to catch a timeout once with strace in the mean time,
although that one was much more straightforward and less mysterious than
the original case: postgres process sends message, about 10 seconds
later it receives a SIGALRM which interrupts an epoll, reply hasn't yet
arrived, error message to client and log file. No waits in functions
which shouldn't wait or messages which arrive much later than they were
(presumably) sent.

The strace doesn't show a reason for the SIGALRM, though. No alarm(2) or
setitimer(2) system call (I connected strace to a running postgres
process just after I got the prompt from "psql" and before I typed
"select * from mb_search('export');" (I used a different (but very
similar) stored procedure for those tests because it is much easier to
find a search which is slow enough to trigger a timeout at least
sometimes than a data request (which normally finishes in
milliseconds)).

So I guess my next task will be to find out where that SIGALRM comes
from and/or whether I can just restart the zmq_msg_recv if it happens. 

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


[GENERAL] plperlu stored procedure seems to freeze for a minute

2015-12-01 Thread Peter J. Holzer
A rather weird observation from the log files of our server (9.5 beta1):

2015-12-01 09:23:37 CET [26265]: [42-1] user=fragner,db=wds,pid=26265 WARNING:  
Use of uninitialized value $success in concatenation (.) or string at 
/usr/local/share/perl/5.20.2/WDS/Macrobond/Utils.pm line 36.
2015-12-01 09:23:37 CET [26265]: [43-1] user=fragner,db=wds,pid=26265 CONTEXT:  
PL/Perl function "mb_timeseriesdata_zmq" 
[lots of other stuff from different connections]
2015-12-01 09:24:45 CET [26265]: [44-1] user=fragner,db=wds,pid=26265 ERROR:  
impossible result '' (payload=) at 
/usr/local/share/perl/5.20.2/WDS/Macrobond/Utils.pm line 36. 
WDS::Macrobond::Utils::decode_result("") called at line 30
main::__ANON__("gen_wqehur") called at -e line 0
eval {...} called at -e line 0

Two messages from the same line of the same plperlu stored procedure, 68
seconds apart. So what is this line 36?

confess "impossible result '$success' (payload=$payload)";

What? The first message clearly comes from interpolating $success
(which is undef at that point) into the argument. The second from
confess itself. What could cause a plperlu procedure to freeze for 68
seconds between the call to confess and its output?

Is it possible that only the writing of the log entry is delayed?

Another weird thing: $success is undef because a ØMQ rpc call[1] timed
out (after 60 seconds, which is also a mystery, because ØMQ doesn't seem
to have a default timeout of 60 seconds, and I don't set one). But at
09:24:45 (i.e. the time of the error message) the answer for that RPC
call arrived. So it kind of looks like confess waited for the message to
arrive (which makes no sense at all) or maybe that confess waited for
something which also blocked the sending of the request (because
according to the server logs, the RPC request only arrived there at
09:24:45 and was answered within 1 second), but that doesn't make any
sense either. (Just noticed that 60 + 68 == 128, which is also a round
number).

hp


[1] ØMQ is an IPC framework: See http://zeromq.org/ We use it to make
RPC calls from stored procedures to a server process.


-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] DISTINCT in STRING_AGG

2015-11-28 Thread John J. Turner
On Nov 28, 2015, at 1:35 PM, Sterpu Victor  wrote:
> Hello
>  
> Can I make a distinct STRING_AGG?
> This is my query :
> SELECT atjs.id, STRING_AGG(CAST(aqjs1.id AS VARCHAR), ',' ORDER BY 
> aqjs1.to_left) AS children
> FROM administration.ad_query_join_select atjs 
> JOIN administration.ad_query aq ON (aq.id=atjs.id_ad_query) 
> LEFT JOIN administration.ad_query_join_select aqjs1 ON (aqjs1.id_ad_query = 
> atjs.id_ad_query AND aqjs1.to_left > atjs.to_left AND 
> aqjs1.to_right < atjs.to_right) 
> LEFT JOIN administration.ad_query_join_select aqjs2 ON (aqjs2.id_ad_query = 
> atjs.id_ad_query AND aqjs2.to_left > atjs.to_left AND 
> aqjs2.to_right < atjs.to_right AND aqjs2.to_left < aqjs1.to_left AND 
> aqjs2.to_right > aqjs1.to_right) 
> LEFT JOIN administration.ad_query_join_select aqjs3 ON (aqjs3.id_ad_query = 
> atjs.id_ad_query AND aqjs3.to_left < atjs.to_left AND 
> aqjs3.to_right > atjs.to_right)
> WHERE aqjs2.id IS NULL AND atjs.id_ad_query = 475543
> GROUP BY aq.id, atjs.id 
> ORDER BY aq.id ASC, atjs.to_left ASC;
>  
> And "childen" contain doubles. The result is:
> id  ; children
> 1399029;"1399031,1399031"
> 1399031;"1399032,1399032,1399032,1399033,1399033,1399033"
>  
> There are doubles because of the join aqjs3 witch is producing this 
> problem.Can I make it so the children ID's are unique?


Just to mention, this looks like a good candidate for range types and CTE’s.

The > / < comparisons appear to be mutually exclusive in each LEFT JOIN clause, 
so it’s not apparent why aqjs3 is causing duplication, as you’ve stated.

As far as I can see, without providing us with your table constraints/keys, 
there’s no way to determine what makes your ID values unique…

However, if you defer your STRING_AGG until after you derive a distinct 
“staging” result set from the joins, then you can effect uniqueness -
e.g. (air code):
   WITH q AS 
  (SELECT aq.id aq_parent_id, atjs.id atjs_parent_id,
   CAST(aqjs1.id AS VARCHAR) child FROM ... GROUP BY aq.id, atjs.id, aqjs1.id)
   SELECT atjs_parent_id, STRING_AGG(child, ',' ORDER BY aqjs.to_left) children 
   FROM q LEFT JOIN (SELECT DISTINCT id_ad_query, to_left
   FROM administration.ad_query_join_select) aqjs ON …
   GROUP BY aq_parent_id, parent
   ORDER BY aq_parent_id, atjs.to_left;

Something along these lines ‘may’ produce a unique set of child values for each 
id by which to perform a STRING_AGG on, but again, I can only guess based on 
the lack of definition provided for your table constraints.
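
Stripped down to its essentials the pattern is: deduplicate first, aggregate
afterwards (all names below are invented). Note that string_agg(DISTINCT ...)
only allows ORDER BY on the aggregated expression itself, which is why the
separate deduplication step is needed when ordering by another column:

    WITH dedup AS (
        SELECT DISTINCT parent_id, child_id, child_pos
        FROM child_join          -- whatever join produces the duplicates
    )
    SELECT parent_id,
           STRING_AGG(child_id::text, ',' ORDER BY child_pos) AS children
    FROM dedup
    GROUP BY parent_id;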

John 



Re: [GENERAL] How can I change defined schema of linked tables when using Access linked table manager odbc connection

2015-10-29 Thread John J. Turner
On Oct 29, 2015, at 6:14 AM, Killian Driscoll  wrote:
> I am using postgresql 9.3 on Windows 8 64, and am using Access as a frontend 
> data entry. In postgresql I have changed the schema from 'public' to 'irll'. 
> The linked table in Access are still linked as 'public' and I need to 
> update/change the connection of the tables to match the new schema in the 
> postgresql db.
> 
> I can easily set up a new odbc connection with the new schema, but if I do 
> this I will lose the data input forms I have set up in Access. How can I 
> change the defined schema of the linked tables? 


This sounds more like an Access question, so an inquiry to an Access forum may 
be more fruitful, but have you tried the linked table manager in Access?

You can swap the existing linked table objects to their new connection…  

-John






[GENERAL] Upgrade from postgresql-9.5alpha1 to postgresql-9.5alpha2 (Debian packages)

2015-08-18 Thread Peter J. Holzer
It looks like the catalog version has changed between 9.5alpha1 and
9.5alpha2:

 FATAL:  database files are incompatible with server
 DETAIL:  The database cluster was initialized with CATALOG_VERSION_NO 
201506282,
  but the server was compiled with CATALOG_VERSION_NO 201507281.
 HINT:  It looks like you need to initdb.

Normally, one would use pg_upgradecluster to do the upgrade, but
pg_upgradecluster assumes that the old and new version are installed in
parallel. Likewise, the low-level tool pg_upgrade needs the old bindir,
if I read the man-page correctly, and of course, apt-get upgrade
overwrites that, since it's just two versions of the same package
(unlike a major upgrade which is a new package). 

So, what's the best way to do the upgrade?

* Copy the bindir before the upgrade (or restore from backup) to a safe
  place and do pg_upgrade?
* Initdb a new cluster and restore yesterdays backup? 
* Something else?

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] Upgrade from postgresql-9.5alpha1 to postgresql-9.5alpha2 (Debian packages)

2015-08-18 Thread Peter J. Holzer
On 2015-08-18 20:40:10 +0900, Masahiko Sawada wrote:
 On Tue, Aug 18, 2015 at 6:06 PM, Peter J. Holzer hjp-pg...@hjp.at wrote:
  It looks like the catalog version has changed between 9.5alpha1 and
  9.5alpha2:
[...]
  So, what's the best way to do the upgrade?
 
  * Copy the bindir before the upgrade (or restore from backup) to a safe
place and do pg_upgrade?
  * Initdb a new cluster and restore yesterdays backup?
  * Something else?
 
 
 I've not used pg_upgrade in such a case, but if you have enough time
 to do the upgrade, I would suggest taking a backup (pg_dumpall) from the
 current cluster and then restoring it to the new cluster.

Of course you would have to make a backup before the upgrade to restore
it afterwards. I could of course have forcibly downgraded to alpha1
again and made a new backup, but since this is a test system I just
decided to drop and recreate the cluster and restore yesterday's backup.
(One of my colleagues won't be too pleased about that, I think)

 I think pg_upgrade is basically used at major version upgrading.

This was basically a major version upgrade, the problem was that it
wasn't reflected in the package/directory structure (normally the
postgres debian packages are quite fastidious in separating everything
so that you have both an old and a new installation in the places where
pg_upgradecluster expects them), and that I didn't expect it (the
possibility of catalog version changes from one alpha release to the
next was discussed before the release of alpha1, but I somehow
classified that as theoretically possible but not likely - my fault),
and finally that I don't really understand the finer points of
pg_upgrade (I managed to use it in a similar situation some time ago,
but I had to read the source code of pg_upgradecluster (and I think I
even single-stepped through it in the debugger) to figure out the
parameters and unfortunately I didn't take notes). 

No big harm done (alpha software on a test system - I expect things to
blow up once in a while), but maybe the person preparing the alpha
releases can figure out how to make the upgrade smoother. At least a
warning in the release announcement would be nice (wouldn't have helped
me as I have to admit that I read that only after I upgraded, but it
would help those who do things in the right order ;-) ).

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] Queries for unused/useless indexes

2015-05-25 Thread Peter J. Holzer
On 2015-05-22 09:41:57 -0400, Melvin Davidson wrote:
 I'd like to share those queries with the community, as I know there must be
 others out there with the same problem.
 
 /* useless_indexes.sql */
 SELECT
    idstat.schemaname AS schema,
    idstat.relname AS table_name,
    indexrelname AS index_name,
    idstat.idx_scan AS times_used,
    pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) || '.' 
 ||
 quote_ident(idstat.relname))) AS table_size,
    pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) || '.' 
 ||
 quote_ident(indexrelname))) AS index_size,
    n_tup_upd + n_tup_ins + n_tup_del as num_writes,
    indexdef AS definition
 FROM pg_stat_user_indexes AS idstat
 JOIN pg_indexes ON indexrelname = indexname
 JOIN pg_stat_user_tables AS tabstat ON idstat.relname = tabstat.relname
 WHERE idstat.idx_scan < 200
 AND indexdef !~* 'unique'
 ORDER BY idstat.schemaname,
  idstat.relname,
  indexrelname;

Thanks, that's useful.

However, it doesn't quite work if there are indexes with the same name
in different schemas. Better join on the schemaname, too:

FROM pg_stat_user_indexes AS idstat
JOIN pg_indexes AS idx ON indexrelname = indexname and idstat.schemaname = 
idx.schemaname
JOIN pg_stat_user_tables AS tabstat ON idstat.relname = tabstat.relname and 
idstat.schemaname = tabstat.schemaname

(for some reason that makes it a lot slower, though)
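
An alternative sketch that joins on OIDs instead of names sidesteps the
schema-name issue entirely (the threshold and the unique-index filter mirror
the intent of the original query; num_writes is left out for brevity):

SELECT s.schemaname,
       s.relname AS table_name,
       s.indexrelname AS index_name,
       s.idx_scan AS times_used,
       pg_size_pretty(pg_relation_size(s.relid)) AS table_size,
       pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size,
       pg_get_indexdef(s.indexrelid) AS definition
FROM pg_stat_user_indexes AS s
JOIN pg_index AS i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan < 200
  AND NOT i.indisunique
ORDER BY s.schemaname, s.relname, s.indexrelname;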

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] Queries for unused/useless indexes

2015-05-25 Thread Peter J. Holzer
On 2015-05-25 12:25:01 -0400, Melvin Davidson wrote:
 I'm not sure why you are using pg_stat_user_indexes.

Because you did. I didn't change that.

 My original query below
 uses pg_stat_all_indexes and the schema names are joined and it does work.

I'm not sure what you mean by original, but this:

 SELECT n.nspname as schema,
    i.relname as table,
    i.indexrelname as index,
    i.idx_scan,
    i.idx_tup_read,
    i.idx_tup_fetch,
    pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' ||
 quote_ident(i.relname))) AS table_size,
    pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' ||
 quote_ident(i.indexrelname))) AS index_size,
    pg_get_indexdef(idx.indexrelid) as idx_definition
   FROM pg_stat_all_indexes i
   JOIN pg_class c ON (c.oid = i.relid)
   JOIN pg_namespace n ON (n.oid = c.relnamespace)
   JOIN pg_index idx ON (idx.indexrelid =  i.indexrelid )
  WHERE i.idx_scan < 200
    AND NOT idx.indisprimary
    AND NOT idx.indisunique
  ORDER BY 1, 2, 3;

is not the query you posted in your original message. 

Here is what you posted:

 On Mon, May 25, 2015 at 10:41 AM, Peter J. Holzer hjp-pg...@hjp.at wrote:
 
 On 2015-05-22 09:41:57 -0400, Melvin Davidson wrote:
  I'd like to share those queries with the community, as I know there must
 be
  others out there with the same problem.
 
  /* useless_indexes.sql */
  SELECT
     idstat.schemaname AS schema,
     idstat.relname AS table_name,
     indexrelname AS index_name,
     idstat.idx_scan AS times_used,
     pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
 '.' ||
  quote_ident(idstat.relname))) AS table_size,
     pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
 '.' ||
  quote_ident(indexrelname))) AS index_size,
     n_tup_upd + n_tup_ins + n_tup_del as num_writes,
     indexdef AS definition
  FROM pg_stat_user_indexes AS idstat
  JOIN pg_indexes ON indexrelname = indexname
  JOIN pg_stat_user_tables AS tabstat ON idstat.relname = tabstat.relname
  WHERE idstat.idx_scan < 200
  AND indexdef !~* 'unique'
  ORDER BY idstat.schemaname,
   idstat.relname,
   indexrelname;

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] quick q re execute scope of new

2015-04-03 Thread Andrew J. Kopciuch
On April 2, 2015, Scott Ribe wrote:
 On Apr 2, 2015, at 10:14 PM, Adrian Klaver adrian.kla...@aklaver.com wrote:
  EXECUTE 'insert into ' || quote_ident(tblname) || ' values(' || new.* ||
  ')'

 Not that easy, strings are not quoted correctly, and null values are blank.
 Might be a function to translate new.* into a string as needed for this
 use, but I found another way based on Tom's suggestion:

 execute('insert into ' || tblnm || ' select $1.*') using new;


I've done similar in triggers for partition schemes, something like this :

EXECUTE 'INSERT INTO ' || partitionName || ' (SELECT ( masterTableName ' || 
quote_literal(NEW) || ').*)';

I can't remember the reference I found online that helped me get there, though.
The key is doing quote_literal on the NEW, and casting it to a compatible 
type.
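
A minimal air-code sketch of that pattern (the master and partition names here
are made up; a real trigger would derive the partition name from NEW):

CREATE OR REPLACE FUNCTION master_insert_trigger() RETURNS trigger AS $$
BEGIN
    -- quote_literal(NEW) turns the whole row into a text literal; casting it
    -- back to the master table's row type lets (row).* expand the columns.
    EXECUTE 'INSERT INTO ' || quote_ident('master_y2015m04')
         || ' SELECT (' || quote_literal(NEW) || '::master).*';
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;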

HTH,


Andy




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] window function ordering not working as expected

2015-02-17 Thread Lonni J Friedman
On Tue, Feb 17, 2015 at 4:18 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Lonni J Friedman netll...@gmail.com writes:
 I'm interested in seeing:
 * the date for the most recent result
 * test name (identifier)
 * most recent result (decimal value)
 * the worst (lowest decimal value) test result from the past 21 days
 * the date which corresponds with the worst test result from the past 21 days
 * the 2nd worst (2nd lowest decimal value) test result
 ...
 The problem that I'm seeing is in the prv_score column. It should show
 a value of 0.6, which corresponds with 2015-02-13, however instead it's
 returning 0.7. I thought by ordering by metrics->'PT TWBR' I'd always
 be sorting by the scores, and as a result, the lead(metrics->'PT
 TWBR', 1) would give me the next greatest value of the score. Thus my
 confusion as to why ORDER BY metrics->'PT TWBR' isn't working as
 expected.

 lead() and lag() retrieve values from rows that are N away from the
 current row in the specified ordering.  That isn't what you want here
 AFAICS.

 I think the worst test result would be obtained with
 nth_value(metrics->'PT TWBR', 1)
 which is equivalent to what you used,
 first_value(metrics->'PT TWBR')
 while the 2nd worst result would be obtained with
 nth_value(metrics->'PT TWBR', 2)

 However, worst and 2nd worst with this implementation would mean
 worst and 2nd worst within the partition, which isn't the stated
 goal either, at least not with the partition definition you're using.

 What you really want for the worst in last 21 days is something like

 min(metrics->'PT TWBR') OVER (
PARTITION BY ... that same mess you used ...
ORDER BY tstamp
RANGE BETWEEN '21 days'::interval PRECEDING AND CURRENT ROW)

 However Postgres doesn't implement RANGE x PRECEDING yet.  You could
 get worst in last 21 observations easily:

 min(metrics->'PT TWBR') OVER (
PARTITION BY ... that mess ...
ORDER BY tstamp
ROWS BETWEEN 20 PRECEDING AND CURRENT ROW)

 and maybe that's close enough.

 I do not know an easy way to get second worst :-(.  You could build a
 user-defined aggregate to produce second smallest value among the inputs
 and then apply it in the same way as I used min() here.

Thanks Tom, much appreciate the fast reply.  I'll chew this over and
see if I have any other questions.
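
For the record, a rough sketch of the kind of user-defined "second smallest"
aggregate Tom describes (numeric is used for brevity, so the json value would
need a cast; NULL inputs aren't handled specially):

CREATE FUNCTION second_min_trans(state numeric[], val numeric)
RETURNS numeric[] AS $$
    -- keep only the two smallest values seen so far
    SELECT ARRAY(SELECT x FROM unnest(state || val) AS x ORDER BY x LIMIT 2);
$$ LANGUAGE sql IMMUTABLE;

CREATE FUNCTION second_min_final(state numeric[]) RETURNS numeric AS $$
    SELECT state[2];   -- NULL until at least two values have been seen
$$ LANGUAGE sql IMMUTABLE;

CREATE AGGREGATE second_min(numeric) (
    SFUNC     = second_min_trans,
    STYPE     = numeric[],
    FINALFUNC = second_min_final,
    INITCOND  = '{}'
);

It would then be applied the same way as the min() example, e.g.
second_min(score) OVER (PARTITION BY ... ORDER BY tstamp ROWS BETWEEN 20
PRECEDING AND CURRENT ROW).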


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] window function ordering not working as expected

2015-02-17 Thread Lonni J Friedman
Greetings,
I have a postgresql-9.3.x database with a table with a variety of date
stamped test results, some of which are stored in json format
(natively in the database). I'm attempting to use some window
functions to pull out specific data from the test results over a a
time window, but part of the results are not making sense. Some tests
run every day, others less frequently. For each unique test's results,
I'm interested in seeing:

* the date for the most recent result
* test name (identifier)
* most recent result (decimal value)
* the worst (lowest decimal value) test result from the past 21 days
* the date which corresponds with the worst test result from the past 21 days
* the 2nd worst (2nd lowest decimal value) test result

Here's a sample of the data and resulting score for one test (tname)
from the past few weeks:

  tstamp   |      tname      | score
-----------+-----------------+-------
2015-02-17 | dfw001.ix-cr-02 | 0.7
2015-02-15 | dfw001.ix-cr-02 | 0.6
2015-02-14 | dfw001.ix-cr-02 | 0.6
2015-02-14 | dfw001.ix-cr-02 | 0.7
2015-02-13 | dfw001.ix-cr-02 | 0.6
2015-02-12 | dfw001.ix-cr-02 | 0.7
2015-02-11 | dfw001.ix-cr-02 | 0.7
2015-02-10 | dfw001.ix-cr-02 | 0.7
2015-02-09 | dfw001.ix-cr-02 | 0.7
2015-02-08 | dfw001.ix-cr-02 | 0.7
2015-02-08 | dfw001.ix-cr-02 | 0.5
2015-02-07 | dfw001.ix-cr-02 | 0.7
2015-02-07 | dfw001.ix-cr-02 | 0.5
2015-02-06 | dfw001.ix-cr-02 | 0.7
2015-02-05 | dfw001.ix-cr-02 | 0.7
2015-02-04 | dfw001.ix-cr-02 | 0.7
2015-01-30 | dfw001.ix-cr-02 | 0.7

Here's the SQL query that I'm running:

SELECT * FROM
(SELECT tstamp,
concat_ws('/',attrs->'RCluster ID',
regexp_replace(replace(replace(attrs->'ASN HTML','/a',''),'a href=
''http://ncapp100.prod.com/Cluster3.php?asn=',''),'\d+(d=5d''
target=''_blank'')','')) AS tname ,
metrics->'PT TWBR' AS score,
first_value(metrics->'PT TWBR') OVER
(PARTITION BY concat_ws('/',attrs->'Route Cluster
ID', regexp_replace(replace(replace(attrs->'ASN HTML','/a',''),'a
href= ''http://ncapp100.prod.com/Cluster3.php?asn=',''),'\d+(d=5d''
target=''_blank'')',''))
ORDER BY metrics->'PT TWBR') AS worst_score,
first_value(tstamp) OVER
(PARTITION BY concat_ws('/',attrs->'Route Cluster
ID', regexp_replace(replace(replace(attrs->'ASN HTML','/a',''),'a
href= ''http://ncapp100.prod.com/Cluster3.php?asn=',''),'\d+(d=5d''
target=''_blank'')',''))
ORDER BY metrics->'PT TWBR') AS worst_date,
lead(metrics->'PT TWBR', 1) OVER
(PARTITION BY concat_ws('/',attrs->'Route Cluster
ID', regexp_replace(replace(replace(attrs->'ASN HTML','/a',''),'a
href= ''http://ncapp100.prod.com/Cluster3.php?asn=',''),'\d+(d=5d''
target=''_blank'')',''))
ORDER BY metrics->'PT TWBR') AS prv_score
FROM btworks
WHERE
age(now(),tstamp) < '21 days'
ORDER BY tstamp DESC, rank
) AS stuff
WHERE
tstamp = '2015-02-17';

Here's the data from the above query as it pertains to the data
(tname='dfw001.ix-cr-02') set that I posted above:

   tstamp   |      tname      | score | worst_score | worst_date | prv_score
------------+-----------------+-------+-------------+------------+-----------
 2015-02-17 | dfw001.ix-cr-02 | 0.7   | 0.5         | 2015-02-08 | 0.7

The problem that I'm seeing is in the prv_score column. It should show
a value of 0.6, which corresponds with 2015-02-13, however instead it's
returning 0.7. I thought by ordering by metrics->'PT TWBR' I'd always
be sorting by the scores, and as a result, the lead(metrics->'PT
TWBR', 1) would give me the next greatest value of the score. Thus my
confusion as to why ORDER BY metrics->'PT TWBR' isn't working as
expected.


thanks in advance for any pointers.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Whats is lock type transactionid?

2014-07-17 Thread Douglas J Hunley
On Thu, Jul 17, 2014 at 3:34 PM, AI Rumman rumman...@gmail.com wrote:

 Hi,

 I have been facing lock contention in my Postgresql 9.1 DB.
 And when I am querying in the pg_locks table I found a lock type with
 transactionid.
 Could someone please tell me what it means?

 Thanks.


from http://www.postgresql.org/docs/9.3/static/view-pg-locks.html :
Every transaction holds an exclusive lock on its virtual transaction ID for
its entire duration. If a permanent ID is assigned to the transaction
(which normally happens only if the transaction changes the state of the
database), it also holds an exclusive lock on its permanent transaction ID
until it ends. When one transaction finds it necessary to wait specifically
for another transaction, it does so by attempting to acquire share lock on
the other transaction ID (either virtual or permanent ID depending on the
situation). That will succeed only when the other transaction terminates
and releases its locks.

I believe that describes what you're seeing

-- 
Douglas J Hunley (doug.hun...@gmail.com)


Re: [GENERAL] Whats is lock type transactionid?

2014-07-17 Thread Douglas J Hunley
On Thu, Jul 17, 2014 at 12:54 PM, AI Rumman rumman...@gmail.com wrote:

 I am experiencing lock contention on one single UPDATE statement at a
 certain time in whole day. This is a small table to UPDATE.
 My suspect is we are facing it for one specific ID.
 Could you please let me know how can I identify the tuple.


Have you tried the lock monitoring queries on
http://wiki.postgresql.org/wiki/Lock_Monitoring yet by chance?
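
If it helps, here's a cut-down sketch of that kind of query, limited to the
transactionid lock type you're seeing (the pg_stat_activity column names below
are the 9.2+ ones; on 9.1 substitute procpid and current_query):

SELECT waiting.pid        AS waiting_pid,
       waiting_act.query  AS waiting_query,
       holder.pid         AS holding_pid,
       holder_act.query   AS holding_query
FROM pg_locks AS waiting
JOIN pg_locks AS holder
  ON holder.locktype = 'transactionid'
 AND holder.transactionid = waiting.transactionid
 AND holder.granted
 AND NOT waiting.granted
JOIN pg_stat_activity AS waiting_act ON waiting_act.pid = waiting.pid
JOIN pg_stat_activity AS holder_act  ON holder_act.pid  = holder.pid
WHERE waiting.locktype = 'transactionid';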


-- 
Douglas J Hunley (doug.hun...@gmail.com)


[GENERAL] pros/cons of using synchronous commit=off - AWS in particular

2014-06-19 Thread Larry J Prikockis
so from the much-loved 
https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server page, we 
have this:



PostgreSQL can only safely use a write cache if it has a battery backup. 
See WAL reliability for an essential introduction to this topic. No, 
really; go read that right now, it's vital to understand that if you 
want your database to work right.

...
For situations where a small amount of data loss is acceptable in return 
for a large boost in how many updates you can do to the database per 
second, consider switching synchronous commit off. This is particularly 
useful in the situation where you do not have a battery-backed write 
cache on your disk controller, because you could potentially get 
thousands of commits per second instead of just a few hundred.

...


My question is-- does it make sense to switch synchronous commit off for 
EBS-backed EC2 instances running postgresql at Amazon? Has anyone done 
any benchmarking of this change on AWS? Since EBS is a black box to us 
as end users, I have no clue what type of caching- volatile or not-- may 
be going on behind the scenes.
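
One mitigating factor: synchronous_commit can be changed without a restart and
scoped quite narrowly, which should make it easy to benchmark on an EBS-backed
instance before changing it cluster-wide (the role name below is just an
example):

SET synchronous_commit TO off;                 -- current session only

BEGIN;
SET LOCAL synchronous_commit TO off;           -- this transaction only
-- ... batch of writes ...
COMMIT;

ALTER ROLE bulk_loader SET synchronous_commit = off;  -- one role; everything else stays durable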



--
Larry J. Prikockis
System Administrator
240-965-4597 (direct)
lprikoc...@vecna.com
http://www.vecna.com



Vecna Technologies, Inc.
6404 Ivy Lane Suite 500
Greenbelt, MD 20770
Phone: (240) 965-4500
Fax: (240) 547-6133

Better Technology, Better World (TM)
The contents of this message may be privileged and confidential. 
Therefore, if this message has been received in error, please delete it 
without reading it. Your receipt of this message is not intended to 
waive any applicable privilege. Please do not disseminate this message 
without the permission of the author.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Monitoring Streaming Replication in 9.2

2014-05-16 Thread J Adams
Newb question here. I have streaming replication working with 9.2 and I'm
using Bucardo's check_postgres.pl to monitor replication. I see that it
runs this query on the slave:

SELECT pg_last_xlog_receive_location() AS receive,
pg_last_xlog_replay_location() AS replay

That returns hex, which is then converted to a number in the script.

My question is this: what does that number represent? Is it just the log
position? If so, how does the log position translate to queries? Does one
log position = one query? (I did say this was a newb question.)

How do I determine a meaningful alert threshold for that value? Is there a
reliable way to monitor replication lag in seconds? How do other people
handle this?
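
For what it's worth, the value is a position (a byte offset) in the WAL stream,
not a count of queries. Two common checks on a 9.2 standby are sketched below:
the first gives the lag in bytes between received and replayed WAL, the second
an approximate lag in seconds, although that timestamp only advances while WAL
is actually being replayed:

SELECT pg_xlog_location_diff(pg_last_xlog_receive_location(),
                             pg_last_xlog_replay_location()) AS replay_lag_bytes;

SELECT now() - pg_last_xact_replay_timestamp() AS replay_delay;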


[GENERAL] duplicate OID issue when using pg_upgrade to move from 8.4 to 9.2

2014-01-01 Thread Reiser, John J.
Hello,

I'm working on an upgrade to our database cluster, attempting to move from 8.4 
to 9.2. I'm encountering the following error when I attempt the upgrade (in 
pg_upgrade_restore.log):

CREATE FUNCTION st_envelope_in(cstring) RETURNS st_envelope
LANGUAGE c IMMUTABLE STRICT
AS 'st_geometry', 'ST_ENVELOPE_In';
psql:pg_upgrade_dump_db.sql:371910: ERROR:  duplicate key value violates unique 
constraint pg_type_oid_index
DETAIL:  Key (oid)=(1407909) already exists.

I'm running this on CentOS 6; both 8.4 and 9.2 are installed from the Yum 
repository. PostgreSQL is primarily used for GIS data and has ESRI st_geometry 
and PostGIS installed in several of the databases. (ESRI's support is only up 
to 9.2, which is why I'm not attempting a move to 9.3.) The interesting thing 
with this error is that when I wipe out the 9.2 data directory, re-initdb, and 
run the upgrade again, I now get a different error:

CREATE TABLESPACE sde1 OWNER sde LOCATION '/disk2/pgsql/data/sde';
psql:pg_upgrade_dump_globals.sql:294: ERROR:  directory 
/disk2/pgsql/data/sde/PG_9.2_201204301 already in use as a tablespace

(I have several of our ESRI SDE databases in their own tablespace.)

Before starting this process, I made a complete file-based backup of the 8.4 
data directory. When I restore the backup to /var/lib/pgsql and run pg_upgrade 
again, I receive the first error again, with the same exact OID value. I will 
admit I don't know much about Postgres internals and I'm not sure how to 
proceed with this duplicate OID issue.

I'm going to try running pg_upgrade with the link option now, but I don't know 
if that will help.

Any assistance provided would be greatly appreciated.

Thanks,
John

John Reiser
Geospatial Research Lab  http://gis.rowan.edu/
Rowan University  http://rowan.edu/geography
201 Mullica Hill Road
Glassboro, NJ 08028
phone: 856-256-4817
cell: 856-347-0047
twitter: @rowangeolab  http://twitter.com/rowangeolab


Re: [GENERAL] duplicate OID issue when using pg_upgrade to move from 8.4 to 9.2

2014-01-01 Thread Reiser, John J.
The --link argument doesn't work, either:

bash-4.1$ export LD_LIBRARY_PATH=/usr/pgsql-9.2/lib
bash-4.1$ /usr/pgsql-9.2/bin/pg_upgrade --old-datadir=/var/lib/pgsql/data 
--new-datadir=/var/lib/pgsql/9.2/data --old-bindir=/usr/bin 
--new-bindir=/usr/pgsql-9.2/bin --check
Performing Consistency Checks
-
Checking current, bin, and data directories ok
Checking cluster versions   ok
Checking database user is a superuser   ok
Checking for prepared transactions  ok
Checking for reg* system OID user data typesok
Checking for contrib/isn with bigint-passing mismatch   ok
Checking for large objects  ok
Checking for presence of required libraries ok
Checking database user is a superuser   ok
Checking for prepared transactions  ok

*Clusters are compatible*

bash-4.1$ /usr/pgsql-9.2/bin/pg_upgrade --old-datadir=/var/lib/pgsql/data 
--new-datadir=/var/lib/pgsql/9.2/data --old-bindir=/usr/bin 
--new-bindir=/usr/pgsql-9.2/bin --link
Performing Consistency Checks
-
Checking current, bin, and data directories ok
Checking cluster versions   ok
Checking database user is a superuser   ok
Checking for prepared transactions  ok
Checking for reg* system OID user data typesok
Checking for contrib/isn with bigint-passing mismatch   ok
Checking for large objects  ok
Creating catalog dump   ok
Checking for presence of required libraries ok
Checking database user is a superuser   ok
Checking for prepared transactions  ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
--
Analyzing all rows in the new cluster   ok
Freezing all rows on the new clusterok
Deleting files from new pg_clog ok
Copying old pg_clog to new server   ok
Setting next transaction ID for new cluster ok
Resetting WAL archives  ok
Setting frozenxid counters in new cluster   ok
Creating databases in the new cluster   ok
Adding support functions to new cluster ok
Restoring database schema to new cluster*failure*

Consult the last few lines of pg_upgrade_restore.log for
the probable cause of the failure.
Failure, exiting
bash-4.1$ tail -n 20 pg_upgrade_restore.log

(1 row)

CREATE TYPE spheroid (
INTERNALLENGTH = 65,
INPUT = spheroid_in,
OUTPUT = spheroid_out,
ALIGNMENT = double,
STORAGE = plain
);
CREATE TYPE
ALTER TYPE public.spheroid OWNER TO reiser;
ALTER TYPE
SET search_path = sde, pg_catalog;
SET
CREATE FUNCTION st_envelope_in(cstring) RETURNS st_envelope
LANGUAGE c IMMUTABLE STRICT
AS 'st_geometry', 'ST_ENVELOPE_In';
psql:pg_upgrade_dump_db.sql:371910: ERROR:  duplicate key value violates unique 
constraint pg_type_oid_index
DETAIL:  Key (oid)=(1407909) already exists.

Again, any help that you could provide would be greatly appreciated.
John

From: Reiser, John Reiser rei...@rowan.edu
Date: Wednesday, January 1, 2014 at 11:53 AM
To: pgsql-general@postgresql.org
Subject: [GENERAL] duplicate OID issue when using pg_upgrade to move from 8.4 
to 9.2

Hello,

I'm working on an upgrade to our database cluster, attempting to move from 8.4 
to 9.2. I'm encountering the following error when I attempt the upgrade (in 
pg_upgrade_restore.log):

CREATE FUNCTION st_envelope_in(cstring) RETURNS st_envelope
LANGUAGE c IMMUTABLE STRICT
AS 'st_geometry', 'ST_ENVELOPE_In';
psql:pg_upgrade_dump_db.sql:371910: ERROR:  duplicate key value violates unique 
constraint pg_type_oid_index
DETAIL:  Key (oid)=(1407909) already exists.

I'm running this on CentOS 6; both 8.4 and 9.2 are installed from the Yum 
repository. PostgreSQL is primarily used for GIS data and has ESRI st_geometry 
and PostGIS installed in several of the databases. (ESRI's support is only up 
to 9.2, which is why I'm not attempting a move to 9.3.) The interesting thing 
with this error is that when I wipe out the 9.2 data directory, re-initdb, and 
run the upgrade again, I now get a different error:

CREATE TABLESPACE sde1 OWNER sde LOCATION '/disk2/pgsql/data/sde';
psql:pg_upgrade_dump_globals.sql:294: ERROR:  directory 
/disk2/pgsql/data/sde/PG_9.2_201204301 already in use as a tablespace

(I have several of our ESRI SDE databases in their own tablespace.)

Before 

Re: [GENERAL] duplicate OID issue when using pg_upgrade to move from 8.4 to 9.2

2014-01-01 Thread Reiser, John J.
Adrian,


On 1/1/14, 12:26 PM, Adrian Klaver adrian.kla...@gmail.com wrote:

On 01/01/2014 09:08 AM, Reiser, John J. wrote:
 The --link argument doesn't work, either:



 Consult the last few lines of pg_upgrade_restore.log for
 the probable cause of the failure.
 Failure, exiting
 bash-4.1$ tail -n 20 pg_upgrade_restore.log
 (1 row)

 CREATE TYPE spheroid (
  INTERNALLENGTH = 65,
  INPUT = spheroid_in,
  OUTPUT = spheroid_out,
  ALIGNMENT = double,
  STORAGE = plain
 );
 CREATE TYPE
 ALTER TYPE public.spheroid OWNER TO reiser;
 ALTER TYPE
 SET search_path = sde, pg_catalog;
 SET
 CREATE FUNCTION st_envelope_in(cstring) RETURNS st_envelope
  LANGUAGE c IMMUTABLE STRICT
  AS 'st_geometry', 'ST_ENVELOPE_In';
 psql:pg_upgrade_dump_db.sql:371910: ERROR:  duplicate key value violates
 unique constraint pg_type_oid_index
 DETAIL:  Key (oid)=(1407909) already exists.

 Again, any help that you could provide would be greatly appreciated.

So have you already installed the GIS stuff into the 9.2 cluster before
the upgrade?

I have installed PostGIS 1.5.8 and the st_geometry.so file that ESRI
requires into the 9.2 directory (/usr/pgsql-9.2/lib). pg_upgrade wouldn't
proceed as far as it did without doing that.

I received your other email and will try the upgrade again and compare the
OIDs in new and old. I'll email the list again once I've done that.

Thanks again for your help.


 John



-- 
Adrian Klaver
adrian.kla...@gmail.com




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] duplicate OID issue when using pg_upgrade to move from 8.4 to 9.2

2014-01-01 Thread Reiser, John J.
On 1/1/14, 12:38 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Reiser, John J. rei...@rowan.edu writes:
 I'm working on an upgrade to our database cluster, attempting to move
from 8.4 to 9.2. I'm encountering the following error when I attempt the
upgrade (in pg_upgrade_restore.log):

 CREATE FUNCTION st_envelope_in(cstring) RETURNS st_envelope
 LANGUAGE c IMMUTABLE STRICT
 AS 'st_geometry', 'ST_ENVELOPE_In';
 psql:pg_upgrade_dump_db.sql:371910: ERROR:  duplicate key value
violates unique constraint pg_type_oid_index
 DETAIL:  Key (oid)=(1407909) already exists.

What this smells like is a bug in the pg_dump --binary_upgrade logic that
tries to preserve type OIDs from the old installation to the new one.
Is there a preceding CREATE TYPE command for st_envelope in the dump
script?  Look for calls to binary_upgrade.set_next_pg_type_oid() and
binary_upgrade.set_next_array_pg_type_oid() in the dump script --- are
there conflicting entries?  Also, exactly what is type 1407909 in the
old installation (try select * from pg_type where oid = 1407909)?

Once I got 8.4 back up, I searched for that OID in pg_type. select * from
pg_type where oid = 1407909; returns 0 rows.

I did find this, searching through pg_upgrade_restore.log. There are 8
instances of the following text in the file:

CREATE TYPE pgis_abs (
INTERNALLENGTH = 8,
INPUT = pgis_abs_in,
OUTPUT = pgis_abs_out,
ALIGNMENT = double,
STORAGE = plain
);
CREATE TYPE
ALTER TYPE public.pgis_abs OWNER TO reiser;
ALTER TYPE
SELECT binary_upgrade.set_next_pg_type_oid('1407909'::pg_catalog.oid);
 set_next_pg_type_oid
--
 
(1 row)

SELECT 
binary_upgrade.set_next_array_pg_type_oid('1407914'::pg_catalog.oid);
 set_next_array_pg_type_oid

 
(1 row)

CREATE TYPE spheroid;
CREATE TYPE
CREATE FUNCTION spheroid_in(cstring) RETURNS spheroid
LANGUAGE c IMMUTABLE STRICT
AS '$libdir/postgis-1.5', 'ellipsoid_in';
CREATE FUNCTION
ALTER FUNCTION public.spheroid_in(cstring) OWNER TO reiser;
ALTER FUNCTION
CREATE FUNCTION spheroid_out(spheroid) RETURNS cstring
LANGUAGE c IMMUTABLE STRICT
AS '$libdir/postgis-1.5', 'ellipsoid_out';
CREATE FUNCTION
ALTER FUNCTION public.spheroid_out(spheroid) OWNER TO reiser;
ALTER FUNCTION
SELECT binary_upgrade.set_next_pg_type_oid('1407909'::pg_catalog.oid);
 set_next_pg_type_oid
--
 
(1 row)

SELECT 
binary_upgrade.set_next_array_pg_type_oid('1407914'::pg_catalog.oid);
 set_next_array_pg_type_oid

 
(1 row)

CREATE TYPE spheroid (
INTERNALLENGTH = 65,
INPUT = spheroid_in,
OUTPUT = spheroid_out,
ALIGNMENT = double,
STORAGE = plain
);
CREATE TYPE
ALTER TYPE public.spheroid OWNER TO reiser;
ALTER TYPE
CREATE FUNCTION _st_asgeojson(integer, geometry, integer, integer)
RETURNS text
LANGUAGE c IMMUTABLE STRICT
AS '$libdir/postgis-1.5', 'LWGEOM_asGeoJson';
CREATE FUNCTION
ALTER FUNCTION public._st_asgeojson(integer, geometry, integer,
integer) OWNER TO reiser;
ALTER FUNCTION





The end of the file differs in that creation of the st_envelope_in
function is attempted instead of _st_asgeojson.

CREATE FUNCTION st_envelope_in
is only in the file 5 times (one being just before the error) and here it
is in context:

SET search_path = sde, pg_catalog;
SET
CREATE FUNCTION st_envelope_in(cstring) RETURNS st_envelope
LANGUAGE c IMMUTABLE STRICT
AS 'st_geometry', 'ST_ENVELOPE_In';
CREATE FUNCTION
ALTER FUNCTION sde.st_envelope_in(cstring) OWNER TO sde;
ALTER FUNCTION

The line
CREATE FUNCTION st_envelope_in(cstring) RETURNS st_envelope
occurs on lines 9076, 106654, 139095, 164850 and 310874 (the error)

and 
SELECT binary_upgrade.set_next_pg_type_oid('1407909'::pg_catalog.oid);
occurs 22 times on lines (1150, 1176, 44192, 44218, 64149, 64175, 71815,
71841, 79844, 79870, 88982, 89008, 97153, 97179, 106523, 106549, 289254,
289280, 297653, 297679, 310824, 310850) and all the close pairs are 26
lines apart, like the excerpt copied above.


Any insight you can provide would be greatly appreciated.

Thanks,
John



 I'm running this on CentOS 6; both 8.4 and 9.2 are installed from the
Yum repository. PostgreSQL is primarily used for GIS data and has ESRI
st_geometry and PostGIS installed in several of the databases. (ESRI's
support is only up to 9.2, which is why I'm not attempting a move to
9.3.) The interesting thing with this error is that when I wipe out the
9.2 data directory, re-initdb, and run the upgrade again, I now get a
different error:

 CREATE TABLESPACE sde1 OWNER sde LOCATION '/disk2/pgsql/data/sde';
 psql:pg_upgrade_dump_globals.sql:294: ERROR:  directory
/disk2/pgsql/data/sde/PG_9.2_201204301 already in use as a tablespace

I think that's pilot error: you forgot to clean out tablespace directories
along with the main data directory.

   regards, tom lane




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription

Re: [GENERAL] duplicate OID issue when using pg_upgrade to move from 8.4 to 9.2

2014-01-01 Thread Reiser, John J.

On 1/1/14, 3:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Reiser, John J. rei...@rowan.edu writes:
 On 1/1/14, 12:38 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 What this smells like is a bug in the pg_dump --binary_upgrade logic
that
 tries to preserve type OIDs from the old installation to the new one.
 Is there a preceding CREATE TYPE command for st_envelope in the dump
 script?  Look for calls to binary_upgrade.set_next_pg_type_oid() and
 binary_upgrade.set_next_array_pg_type_oid() in the dump script --- are
 there conflicting entries?  Also, exactly what is type 1407909 in the
 old installation (try select * from pg_type where oid = 1407909)?

 Once I got 8.4 back up, I searched for that OID in pg_type. select *
from
 pg_type where oid = 1407909; returns 0 rows.

Hm, which database(s) did you check in?  It certainly appears from the
dump text you quote that type spheroid has OID 1407909 in at least one
database.

 I did find this, searching through pg_upgrade_restore.log. There are 8
 instances of the following text in the file:

If I'm reading you right, then these must be instances of the same type
with the same OID declared in different databases.  Could you look through
the dump for \connect commands to verify that?

 SELECT binary_upgrade.set_next_pg_type_oid('1407909'::pg_catalog.oid);
 occurs 22 times on lines (1150, 1176, 44192, 44218, 64149, 64175, 71815,
 71841, 79844, 79870, 88982, 89008, 97153, 97179, 106523, 106549, 289254,
 289280, 297653, 297679, 310824, 310850) and all the close pairs are 26
 lines apart, like the excerpt copied above.

Could you look at the text surrounding these places to determine which
types this OID is being selected for?  Each of these calls should be just
preceding a CREATE TYPE command (with maybe a set_next_array_pg_type_oid
call between) that is supposed to use the specified OID for its type.
Also identify which databases the commands are being issued in, by looking
back for the most recent \connect command.

Also, is there any CREATE TYPE for st_envelope preceding the failing
CREATE FUNCTION command (in the same database)?

   regards, tom lane

Tom, 

Thanks for the info. After searching the output for the connection string,
I found that it's failing on a database that can be archived. I think I'll
get what I need from the database, drop it, then perform the upgrade.

Thank you again for all of your help. It's greatly appreciated!

John




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Errors regarding non-existent files in pg_subtrans

2013-11-11 Thread J Smith
G'day list.

I've recently upgraded a number of servers from PostgreSQL 9.2.5 to
9.3.1 and have started getting the following errors every couple of
hours along with some failed transactions. I have been unable to track
down any sort of rhyme or reason for the errors yet, so I figured I'd
check with the mailing list to see if I've potentially hit into a
potential bug. The errors look like the following:

ERROR could not access status of transaction 4179979
Could not open file pg_subtrans/003F: No such file or directory.

Indeed, pg_subtrans/003F does not exist on the file system.

I have found similar errors mentioned on the list before a number of
years ago with a thread starting here:

http://www.postgresql.org/message-id/2009150225.076c2...@list.ru

A few details:

- the new database cluster was created by dumping global settings
using pg_dumpall and then dumping each database individually. Several
of the databases within this cluster have PostGIS installed and I
wanted to do a full PostGIS upgrade, which necessitated individual
dumps.

- this is the only database in a cluster of 14 databases that is
exhibiting the problem and it is sporadic at best. This database is
fairly write-intensive, and has been up since November 6th. In that
time, we've had 17 such errors.

- the pg_subtrans files mentioned in the log file do not exist on the
file system.

- I have rebuilt the cluster several times, as well as cleaned out the
pg_statistic table and run `VACUUM FULL ANALYZE` on every database in
case that was affecting it based on some similar threads back from the
PostgreSQL 8.4 days, but there was no affect and the errors still
occur.

Anyone have any suggestions or ideas? Thankfully these are dev systems
so I have some room to experiment and can post some more details as
necessary.
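
As a quick sanity check on the numbers (assuming the default 8 kB block size):

    pg_subtrans stores 4 bytes per xid  ->  8192 / 4  = 2048 xids per page
    one SLRU segment holds 32 pages     ->  2048 * 32 = 65536 xids per segment
    0x003F = 63                         ->  segment 003F covers xids 4128768..4194303

4179979 falls inside that range, so the file name and the xid in the error are
consistent; the segment has presumably been truncated away (pg_subtrans is
trimmed at checkpoints) rather than never written.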

The following are the log file entries from one of the errors. Some
details have been sanitized for privacy reasons, but the overall lines
themselves are accurate.

Nov 10 10:14:02 dev-server postgres[29835]: [4-1] user=dev,db=dev
ERROR:  could not access status of transaction 4179979
Nov 10 10:14:02 dev-server postgres[29835]: [4-2] user=dev,db=dev
DETAIL:  Could not open file pg_subtrans/003F: No such file or
directory.
Nov 10 10:14:02 dev-server postgres[29835]: [4-3] user=dev,db=dev
CONTEXT:  SQL statement SELECT 1 FROM ONLY
typhon.collection_batches x WHERE id OPERATOR(pg_catalog.=) $1
FOR KEY SHARE OF x
Nov 10 10:14:02 dev-server postgres[29835]: [4-4] user=dev,db=dev
STATEMENT:  insert into raw (url, id) values ($1, $2)
Nov 10 10:14:02 dev-server postgres[29839]: [4-1] user=dev,db=dev
ERROR:  could not access status of transaction 4179979
Nov 10 10:14:02 dev-server postgres[29839]: [4-2] user=dev,db=dev
DETAIL:  Could not open file pg_subtrans/003F: No such file or
directory.
Nov 10 10:14:02 dev-server postgres[29839]: [4-3] user=dev,db=dev
CONTEXT:  SQL statement SELECT 1 FROM ONLY
typhon.collection_batches x WHERE id OPERATOR(pg_catalog.=) $1
FOR KEY SHARE OF x
Nov 10 10:14:02 dev-server postgres[29839]: [4-4] user=dev,db=dev
STATEMENT:  update listings set value_a=$1 where id=$2
Nov 10 10:14:02 dev-server postgres[29827]: [4-1] user=dev,db=dev
ERROR:  could not access status of transaction 4179979
Nov 10 10:14:02 dev-server postgres[29827]: [4-2] user=dev,db=dev
DETAIL:  Could not open file pg_subtrans/003F: No such file or
directory.
Nov 10 10:14:02 dev-server postgres[29827]: [4-3] user=dev,db=dev
CONTEXT:  SQL statement SELECT 1 FROM ONLY
typhon.collection_batches x WHERE id OPERATOR(pg_catalog.=) $1
FOR KEY SHARE OF x

I can provide additional details like my postgresql.conf if it would
help, although it's fairly standard outside of tuning as provided by
pgtune.

Cheers and thanks, List.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] partitioned table + postgres_FDW not working in 9.3

2013-09-26 Thread Lonni J Friedman
Hi Shigeru,
Thanks for your reply.  This sounds like a relatively simple
workaround, so I'll give it a try.  Is the search_path of the remote
session that postgres_fdw forces considered to be intentional,
expected behavior, or is it a bug?

thanks!

On Wed, Sep 25, 2013 at 7:13 PM, Shigeru Hanada
shigeru.han...@gmail.com wrote:
 Hi Lonni,

 2013/9/25 Lonni J Friedman netll...@gmail.com:
 The problem that I'm experiencing is if I attempt to perform an INSERT
 on the foreign nppsmoke table on cluster a, it fails claiming that the
 table partition which should hold the data in the INSERT does not
 exist:

 ERROR:  relation nppsmoke_2013_09 does not exist
 CONTEXT:  Remote SQL command: INSERT INTO public.nppsmoke(id,
 date_created, last_update, build_type, current_status, info, cudacode,
 gpu, subtest, os, osversion, arch, cl, dispvers, branch, pass, fail,
 oldfail, newfail, failureslog, totdriver, ddcl, buildid, testdcmd,
 pclog, filtercount, filterlog, error) VALUES ($1, $2, $3, $4, $5, $6,
 $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20,
 $21, $22, $23, $24, $25, $26, $27, $28)
 PL/pgSQL function public.nppsmoke_insert_trigger() line 30 at SQL statement

 I could reproduce the problem.

 If I run the same exact SQL INSERT on cluster b (not using the foreign
 table), then it works.  So whatever is going wrong seems to be related
 to the foreign table.  Initially I thought that perhaps the problem
 was that I needed to create all of the partitions as foreign tables on
 cluster a, but that doesn't help.

 Am I hitting some kind of foreign data wrapper limitation, or am I
 doing something wrong?

 The cause of the problem is search_path setting of remote session.
 For some reasons, postgres_fdw forces the search_path on the remote
 side to be 'pg_catalog', so all objects used in the session
 established by postgres_fdw have to be schema-qualified.  Trigger
 function is executed in such context, so you need to qualify all
 objects in your trigger function with schema name, like
 'public.nppsmoke_2013_09'.
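
A minimal sketch of the schema qualification described above (the date test and
the column-free VALUES list are illustrative; the point is the schema-qualified
partition name):

CREATE OR REPLACE FUNCTION public.nppsmoke_insert_trigger()
RETURNS trigger AS $$
BEGIN
    -- postgres_fdw sessions run with search_path = pg_catalog, so the
    -- partition has to be referenced with its schema.
    IF NEW.date_created >= DATE '2013-09-01'
       AND NEW.date_created < DATE '2013-10-01' THEN
        INSERT INTO public.nppsmoke_2013_09 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for date_created = %', NEW.date_created;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;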


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] partitioned table + postgres_FDW not working in 9.3

2013-09-26 Thread Lonni J Friedman
On Thu, Sep 26, 2013 at 8:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Lonni J Friedman netll...@gmail.com writes:
 Thanks for your reply.  This sounds like a relatively simple
 workaround, so I'll give it a try.  Is the search_path of the remote
 session that postgres_fdw forces considered to be intentional,
 expected behavior, or is it a bug?

 It's intentional.

 Possibly more to the point, don't you think your trigger function is
 rather fragile if it assumes the caller has provided a particular
 search path setting?

To be honest, I don't have much experience with functions, and was
using the trigger function from the official documentation:
http://www.postgresql.org/docs/9.3/static/ddl-partitioning.html


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] pg_basebackup: ERROR: could not find any WAL files (9.3)

2013-09-26 Thread Lonni J Friedman
Greetings,
I've recently pushed a new postgres-9.3 (Linux-x86_64/RHEL6) cluster
into production, with one master, and two hot standby streaming
replication slaves.  Everything seems to be working ok, however
roughly half of my pg_basebackup attempts are failing at the very end
with the error:

pg_basebackup: could not get transaction log end position from server:
ERROR:  could not find any WAL files

I should note that I'm running pg_basebackup on one of the two slaves,
and not the master.  However, I've got an older, separate 9.3 cluster
with the same setup, and pg_basebackup never fails there.

I thought that the WAL files in question were coming from the pg_xlog
subdirectory.  But I don't see any lack of files there on the server
running pg_basebackup.  They are being generated continuously (as
expected), before, during & after the pg_basebackup.  I scanned the
source ( http://doxygen.postgresql.org/basebackup_8c_source.html ),
and it seems to backup my understanding of the expected behavior:

306  /*
307  * There must be at least one xlog file in the pg_xlog directory,
308  * since we are doing backup-including-xlog.
309  */
310  if (nWalFiles < 1)
311  ereport(ERROR,
312  (errmsg("could not find any WAL files")));

However, what I see on the server conflicts with the error.
pg_basebackup was invoked on Thu Sep 26 01:00:01 PDT 2013, and failed
on Thu Sep 26 02:09:12 PDT 2013.  In the pg_xlog subdirectory, I see
lots of WAL files present, before, during & after pg_basebackup was
run:
-rw--- 1 postgres postgres 16777216 Sep 26 00:38 0001208A00E3
-rw--- 1 postgres postgres 16777216 Sep 26 00:43 0001208A00E4
-rw--- 1 postgres postgres 16777216 Sep 26 00:48 0001208A00E5
-rw--- 1 postgres postgres 16777216 Sep 26 00:53 0001208A00E6
-rw--- 1 postgres postgres 16777216 Sep 26 00:58 0001208A00E7
-rw--- 1 postgres postgres 16777216 Sep 26 01:03 0001208A00E8
-rw--- 1 postgres postgres 16777216 Sep 26 01:08 0001208A00E9
-rw--- 1 postgres postgres 16777216 Sep 26 01:14 0001208A00EA
-rw--- 1 postgres postgres 16777216 Sep 26 01:19 0001208A00EB
-rw--- 1 postgres postgres 16777216 Sep 26 01:24 0001208A00EC
-rw--- 1 postgres postgres 16777216 Sep 26 01:29 0001208A00ED
-rw--- 1 postgres postgres 16777216 Sep 26 01:34 0001208A00EE
-rw--- 1 postgres postgres 16777216 Sep 26 01:38 0001208A00EF
-rw--- 1 postgres postgres 16777216 Sep 26 01:43 0001208A00F0
-rw--- 1 postgres postgres 16777216 Sep 26 01:48 0001208A00F1
-rw--- 1 postgres postgres 16777216 Sep 26 01:53 0001208A00F2
-rw--- 1 postgres postgres 16777216 Sep 26 01:58 0001208A00F3
-rw--- 1 postgres postgres 16777216 Sep 26 02:03 0001208A00F4
-rw--- 1 postgres postgres 16777216 Sep 26 02:08 0001208A00F5
-rw--- 1 postgres postgres 16777216 Sep 26 02:14 0001208A00F6


Thanks in advance for any pointers.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] postgres FDW doesn't support sequences?

2013-09-25 Thread Lonni J Friedman
I've got two 9.3 clusters, with a postgres foreign data wrapper (FDW)
setup to point from one cluster to the other.  One of the (foreign)
tables associated with the foreign server has a bigint sequence for
its primary key, defined as:

id | bigint  | not null default
nextval('nppsmoke_id_seq1'::regclass)


If I INSERT a new row into the local table (not the foreign table
version), without specifying the 'id' column explicitly, it
automatically is assigned the nextval in the sequence counter.
However, if I attempt to run the same INSERT using the foreign table,
it always fails complaining that null value in column id violates
not-null constraint.  It seems like the FDW is somehow ignoring the
existence of the sequence default value, and rewriting the SQL query
to explicitly attempt to insert a NULL value.  Here's the full query &
resulting error output:

nightly=# INSERT into nppsmoke
(date_created,last_update,build_type,current_status,info,cudacode,gpu,subtest,os,arch,cl,dispvers,branch,totdriver,ddcl,testdcmd,osversion)
VALUES 
((date_trunc('second',now())),(date_trunc('second',now())),'release','Building','npp-release-gpu-buildCUDA-2013-09-24-1380041350.log','2013-09-24.cuda-linux64-test42.release.gpu','380','CUDA
build','Linux','x86_64','16935289','CBS_cuda_a_2013-09-24_16935289','cuda_a','1','16935289','./npp-smoke.sh
--testtype release --amodel f --vc g --drvpath
/home/lfriedman/cuda-stuff/sw/dev/gpu_drv/cuda_a/drivers/gpgpu
--cudaroot /home/lfriedman/cuda-stuff/sw/gpgpu --totdriver t  --email
lfriedman','2.6.32-358.el6.x86_64');
ERROR:  null value in column id violates not-null constraint
DETAIL:  Failing row contains (null, 2013-09-25 08:00:46, 2013-09-25
08:00:46, release, Building,
npp-release-gpu-buildCUDA-2013-09-24-1380041350.log,
2013-09-24.cuda-linux64-test42.release.gpu, 380, CUDA build, Linux,
2.6.32-358.el6.x86_64, x86_64, 16935289,
CBS_cuda_a_2013-09-24_16935289, cuda_a, null, null, null, null, null,
t, 16935289, null, ./npp-smoke.sh --testtype release --amodel f --vc g
--drvpath /h..., null, null, null, null, g).
CONTEXT:  Remote SQL command: INSERT INTO public.nppsmoke(id,
date_created, last_update, build_type, current_status, info, cudacode,
gpu, subtest, os, osversion, arch, cl, dispvers, branch, pass, fail,
oldfail, newfail, failureslog, totdriver, ddcl, buildid, testdcmd,
pclog, filtercount, filterlog, error) VALUES ($1, $2, $3, $4, $5, $6,
$7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20,
$21, $22, $23, $24, $25, $26, $27, $28)

I tried to recreate the foreign table definition with a primary key,
and that failed:
ERROR:  constraints are not supported on foreign tables

Are sequences supported with the postgres FDW?  If not, is there any
workaround for inserting into a foreign table that doesn't require me
to explicitly specify a value for the primary key sequence column in
my INSERT statements?

thanks!


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres FDW doesn't support sequences?

2013-09-25 Thread Lonni J Friedman
On Wed, Sep 25, 2013 at 2:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Lonni J Friedman netll...@gmail.com writes:
 If I INSERT a new row into the local table (not the foreign table
 version), without specifying the 'id' column explicitly, it
 automatically is assigned the nextval in the sequence counter.
 However, if I attempt to run the same INSERT using the foreign table,
 it always fails complaining that null value in column id violates
 not-null constraint.  It seems like the FDW is somehow ignoring the
 existence of the sequence default value, and rewriting the SQL query
 to explicitly attempt to insert a NULL value.

 Yeah, there was quite a bit of discussion about that back in February or
 so.  The short of it is that column default values that are defined on the
 foreign server are not respected by operations on a foreign table; rather,
 you have to attach a DEFAULT specification to the foreign table definition
 if you want inserts into the foreign table to use that default.

 The default expression is executed locally, too, which means that if you'd
 like it to read like nextval('some_seq') then some_seq has to be a local
 sequence, not one on the foreign server.

Is there some elegant mechanism for keeping the local & foreign
sequences in sync?


 I realize that this isn't ideal for serial-like columns, but honoring
 default expressions that would execute on the foreign server turned out
 to be a huge can of worms.  We might figure out how to fix that some day;
 but if we'd insisted on a solution now, there wouldn't be writable foreign
 tables at all in 9.3.

Understood.  Other than reading the code, is there somewhere that
these limitations are documented that I overlooked?
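
A rough sketch of that workaround, for anyone hitting the same thing (the
sequence name and start value are illustrative; the local sequence has to be
kept out of the way of ids generated directly on the remote side):

CREATE SEQUENCE nppsmoke_local_id_seq START 1000000000;

ALTER FOREIGN TABLE nppsmoke
    ALTER COLUMN id SET DEFAULT nextval('nppsmoke_local_id_seq');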


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] partitioned table + postgres_FDW not working in 9.3

2013-09-24 Thread Lonni J Friedman
Greetings,
I've got two different 9.3 clusters set up, a & b (on Linux if that
matters).  On cluster b, I have a table (nppsmoke) that is partitioned
by date (month), which uses a function which is called by a trigger to
manage INSERTS (exactly as documented in the official documentation
for partitioning of tables).  I've setup a postgres foreign data
wrapper server on cluster a which points to cluster b, and then setup
a foreign table (nppsmoke) on cluster a which points to the actual
partitioned (nppsmoke) table on cluster b.  The partitions on cluster
b use the naming scheme nppsmoke_$YYYY_$MM (where Y=4 digit year,
and M=2 digit month).  For example, the current month's partition is
named nppsmoke_2013_09 .

The problem that I'm experiencing is if I attempt to perform an INSERT
on the foreign nppsmoke table on cluster a, it fails claiming that the
table partition which should hold the data in the INSERT does not
exist:

ERROR:  relation nppsmoke_2013_09 does not exist
CONTEXT:  Remote SQL command: INSERT INTO public.nppsmoke(id,
date_created, last_update, build_type, current_status, info, cudacode,
gpu, subtest, os, osversion, arch, cl, dispvers, branch, pass, fail,
oldfail, newfail, failureslog, totdriver, ddcl, buildid, testdcmd,
pclog, filtercount, filterlog, error) VALUES ($1, $2, $3, $4, $5, $6,
$7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20,
$21, $22, $23, $24, $25, $26, $27, $28)
PL/pgSQL function public.nppsmoke_insert_trigger() line 30 at SQL statement

If I run the same exact SQL INSERT on cluster b (not using the foreign
table), then it works.  So whatever is going wrong seems to be related
to the foreign table.  Initially I thought that perhaps the problem
was that I needed to create all of the partitions as foreign tables on
cluster a, but that doesn't help.

Am I hitting some kind of foreign data wrapper limitation, or am I
doing something wrong?

thanks


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] upgrade from 9.2.x to 9.3 causes significant performance degradation

2013-09-18 Thread Lonni J Friedman
On Wed, Sep 18, 2013 at 2:02 AM, Kevin Grittner kgri...@ymail.com wrote:
 Lonni J Friedman netll...@gmail.com wrote:

 top shows over 90% of the load is in sys space.  vmstat output
 seems to suggest that it's CPU bound (or bouncing back & forth):

 Can you run `perf top` during an episode and see what kernel
 functions are using all that CPU?

Oddly, the problem went away on its own yesterday just after 4PM, and
performance has remained 'normal' since that time.  I changed
absolutely nothing.  If/when it returns, I'll certainly capture that
output.


 This looks similar to cases I've seen of THP defrag going wild.
 Did the OS version or configuration change?  Did the PostgreSQL
 memory settings (like shared_buffers) change?

Nothing changed other than the version of postgres.  I re-used the
same postgresql.conf that was in place when running 9.2.x.

Anyway, here are the current THP related settings on the server:
[root@cuda-db7 ~]# grep AnonHugePages /proc/meminfo
AnonHugePages:548864 kB
[root@cuda-db7 ~]# egrep 'trans|thp' /proc/vmstat
nr_anon_transparent_hugepages 272
thp_fault_alloc 129173889
thp_fault_fallback 17462551
thp_collapse_alloc 148437
thp_collapse_alloc_failed 15143
thp_split 242


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] upgrade from 9.2.x to 9.3 causes significant performance degradation

2013-09-18 Thread Lonni J Friedman
On Wed, Sep 18, 2013 at 2:02 AM, Kevin Grittner kgri...@ymail.com wrote:
 Lonni J Friedman netll...@gmail.com wrote:

 top shows over 90% of the load is in sys space.  vmstat output
 seems to suggest that its CPU bound (or bouncing back  forth):

 Can you run `perf top` during an episode and see what kernel
 functions are using all that CPU?

I take back what I said earlier.  While the master is currently back
to normal performance, the two hot standby slaves are still churning
something awful.

If I run 'perf top' on either slave, after a few seconds, these are
consistently the top three in the list:
 84.57%  [kernel]   [k] _spin_lock_irqsave
  6.21%  [unknown]  [.] 0x00659f60
  4.69%  [kernel]   [k] compaction_alloc


 This looks similar to cases I've seen of THP defrag going wild.
 Did the OS version or configuration change?  Did the PostgreSQL
 memory settings (like shared_buffers) change?

I think you're onto something here with respect to THP defrag going
wild.  I set /sys/kernel/mm/transparent_hugepage/defrag to 'never' and
immediately the load dropped on both slaves from over 5.00 to under
1.00.

So this raises the question, is this a kernel bug, or is there some
other solution to the problem?
Also, seems weird that the problem didn't happen until I switched from
9.2 to 9.3.  Is it possible this is somehow related to the change from
using SysV shared memory to using Posix shared memory and mmap for
memory management?


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] upgrade from 9.2.x to 9.3 causes significant performance degradation

2013-09-17 Thread Lonni J Friedman
Greetings,
I'm running a PostgreSQL 9.3.0 cluster (1 master with two streaming
replication hot standby slaves) on RHEL6-x86_64.  Yesterday I upgraded
from 9.2.4 to 9.3.0, and since the upgrade I'm seeing a significant
performance degradation.  PostgreSQL simply feels slower.  Nothing
other than the version of PostgreSQL changed yesterday.  I used
pg_upgrade to perform the upgrade, and ran the generated
analyze_new_cluster.sh immediately afterwards, which completed
successfully.

Prior to the upgrade, I'd generally expect a load average of less than
2.00 on the master, and less than 1.00 on each of the slaves.  Since
the upgrade, the load average on the master has been in double digits
(hitting 100.00 for a few minutes), and the slaves are consistently
above 5.00.

There are a few things that are jumping out at me as behaving
differently since the upgrade.  vmstat processes waiting for runtime
counts have increased dramatically.  Prior to the upgrade the process
count would be consistently less than 10, however since upgrading it
hovers between 40 & 60 at all times.  /proc/interrupts Local timer
interrupts has increased dramatically as well.  It used to hover
around 6000 and is now over 20k much of the time.  However, I'm
starting to suspect that they are both symptoms of the problem rather
than the cause.

At this point, I'm looking for guidance on how to debug this problem
more effectively.

thanks




Re: [GENERAL] upgrade from 9.2.x to 9.3 causes significant performance degradation

2013-09-17 Thread Lonni J Friedman
On Tue, Sep 17, 2013 at 9:54 AM, Eduardo Morras emorr...@yahoo.es wrote:
 On Tue, 17 Sep 2013 09:19:29 -0700
 Lonni J Friedman netll...@gmail.com wrote:

 Greetings,
 I'm running a PostgreSQL 9.3.0 cluster (1 master with two streaming
 replication hot standby slaves) on RHEL6-x86_64.  Yesterday I upgraded
 from 9.2.4 to 9.3.0, and since the upgrade I'm seeing a significant
 performance degradation.  PostgreSQL simply feels slower.  Nothing
 other than the version of PostgreSQL changed yesterday.  I used
 pg_upgrade to perform the upgrade, and ran the generated
 analyze_new_cluster.sh immediately afterwards, which completed
 successfully.

 Prior to the upgrade, I'd generally expect a load average of less than
 2.00 on the master, and less than 1.00 on each of the slaves.  Since
 the upgrade, the load average on the master has been in double digits
 (hitting 100.00 for a few minutes), and the slaves are consistently
 above 5.00.

 There are a few things that are jumping out at me as behaving
 differently since the upgrade.  vmstat processes waiting for runtime
 counts have increased dramatically.  Prior to the upgrade the process
 count would be consistently less than 10, however since upgrading it
 hovers between 40 & 60 at all times.  /proc/interrupts Local timer
 interrupts has increased dramatically as well.  It used to hover
 around 6000 and is now over 20k much of the time.  However, I'm
 starting to suspect that they are both symptoms of the problem rather
 than the cause.

 At this point, I'm looking for guidance on how to debug this problem
 more effectively.

 Don't know what happens but:

 a) Does analyze_new_cluster.sh include a reindex? If not, indexes are useless
 because the analyze statistics say so.

No, it doesn't include a reindex.  It merely invokes vacuumdb --all
--analyze-only several times with different option values, e.g.
-c default_statistics_target=1 -c vacuum_cost_delay=0.
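
In other words, the generated script boils down to a staged analyze along
these lines (a sketch, not the verbatim script):

  PGOPTIONS="-c default_statistics_target=1 -c vacuum_cost_delay=0" vacuumdb --all --analyze-only
  PGOPTIONS="-c default_statistics_target=10 -c vacuum_cost_delay=0" vacuumdb --all --analyze-only
  vacuumdb --all --analyze-only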

According to the documentation for pg_upgrade, post-upgrade scripts to
rebuild tables and indexes will be generated automatically.  Nothing
was generated for this purpose, at least not in any obvious place.
The analyze_new_cluster.sh script is the only one that was
automatically generated as far as I can tell.

 b) Did you configure postgresql.conf on 9.3.0 for your server/load? Perhaps 
 it has default install values.

Yes, I'm using the same postgresql.conf as I was using when running
9.2.4.  Its definitely not running with default install values.

 c) What does logs say?

The postgres server logs look perfectly normal, minus a non-trivial
slower run time for most queries.  There's nothing unusual in any of
the OS level logs (/var/log/messages, etc) or dmesg.



Do you have any other suggestions?




Re: [GENERAL] upgrade from 9.2.x to 9.3 causes significant performance degradation

2013-09-17 Thread Lonni J Friedman
Thanks for your reply.  Comments/answers inline below

On Tue, Sep 17, 2013 at 11:28 AM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Tue, Sep 17, 2013 at 11:22 AM, Lonni J Friedman netll...@gmail.com
 wrote:


  c) What does logs say?

 The postgres server logs look perfectly normal, minus a non-trivial
 slower run time for most queries.  There's nothing unusual in any of
 the OS level logs (/var/log/messages, etc) or dmesg.


 Are you generally CPU limited or IO limited?

top shows over 90% of the load is in sys space.  vmstat output seems
to suggest that it's CPU bound (or bouncing back & forth):

procs ---memory-- ---swap-- -io --system-- -cpu-
 r  b   swpd   free   buff  cache   si   sobibo   in   cs us sy id wa st
 1  0  17308 852016 141104 12707419200101800
6  4 90  0  0
 0  0  17308 872316 141104 12707420000 0   988  940  564
1  0 99  0  0
 0  0  17308 884288 141104 12707420800 0  1921 1202 2132
1  0 99  0  0
 0  0  17308 898728 141104 12707420800 0 0 1064  577
1  0 99  0  0
 2  0  17308 914920 141104 12707422400 044  820  427
1  0 99  0  0
 0  0  17308 926524 141104 12707427200 048 1173  585
1  0 99  0  0
108  1  17308 753648 141104 12707422400 0   236 9825 3901
12  5 83  0  0
50  0  17308 723156 141104 12707440000 0   144 43481 9105
20 79  1  0  0
45  0  17308 722860 141104 12707441600 0 8 32969 1998
1 97  2  0  0
47  0  17308 738996 141104 12707441600 0 0 34099 1739
1 99  0  0  0
101  0  17308 770220 141104 12707448000 032 38550 5998
 7 93  0  0  0
101  0  17308 775732 141104 12707451200 0   156 33889 5809
 4 96  0  0  0
99  0  17308 791232 141104 12707454400 0 0 32385 4981
0 100  0  0  0
96  0  17308 803156 141104 12707454400 024 32413 4824
0 100  0  0  0
87  0  17308 811624 141104 12707454400 0 0 32438 4470
0 100  0  0  0
83  0  17308 815500 141104 12707454400 0 0 32489 4159
0 100  0  0  0
80  0  17308 826572 141104 12707455200 033 32582 3948
0 100  0  0  0
73  0  17308 853264 141108 12707455200 052 32833 3840
0 100  0  0  0
73  0  17308 882240 141108 12707456000 0 4 32820 3594
0 100  0  0  0
72  0  17308 892256 141108 12707456000 0 0 32368 3516
0 100  0  0  0
###

iostat consistently shows %util under 1.00 which also suggests that
disk IO is not the bottleneck:
#
iostat -dx /dev/sdb 5
Linux 2.6.32-358.6.2.el6.x86_64 (cuda-db7) 09/17/2013 _x86_64_
   (32 CPU)

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        0.02    0.21   6.91   31.33   651.60  1121.85     46.38      0.09   2.25   0.08   0.31

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        0.00    0.00   0.00    8.00     0.00    93.00     11.62      0.00   0.28   0.20   0.16

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        0.00    0.00   0.00   11.00     0.00   125.40     11.40      0.00   0.16   0.16   0.18

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        0.00    0.00   0.00  105.00     0.00  3380.40     32.19      0.29   2.76   0.03   0.34

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        0.00    0.00   0.00   14.80     0.00  2430.60    164.23      0.00   0.12   0.09   0.14

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        0.00    1.20   0.00   41.60     0.00  1819.40     43.74      0.02   0.45   0.05   0.20

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        0.00    0.00   0.00    2.80     0.00    32.00     11.43      0.00   0.00   0.00   0.00
#

mpstat also shows a virtually 0 iowait, with a ton of sys (CPU) time:
#
mpstat 2 10
Linux 2.6.32-358.6.2.el6.x86_64 (cuda-db7) 09/17/2013 _x86_64_
   (32 CPU)

12:53:19 PM  CPU   %usr  %nice    %sys  %iowait  %irq  %soft  %steal  %guest  %idle
12:53:21 PM  all   7.36   0.00   92.58     0.00  0.00   0.03    0.00    0.00   0.03
12:53:23 PM  all   6.35   0.00   90.43     0.00  0.00   0.03    0.00    0.00   3.19
12:53:25 PM  all   3.13   0.00   68.20     0.00  0.00   0.02    0.00    0.00  28.66
12:53:27 PM  all   6.07   0.00   68.46     0.00  0.00   0.03    0.00    0.00  25.44
12:53:29 PM  all   5.83   0.00   94.14     0.00  0.00   0.03    0.00    0.00   0.00
12:53:31 PM  all   5.75   0.00   94.14     0.00  0.00   0.11    0.00    0.00   0.00
12:53:33 PM  all   7.65   0.00   40.32

Re: [GENERAL] upgrade from 9.2.x to 9.3 causes significant performance degradation

2013-09-17 Thread Lonni J Friedman
On Tue, Sep 17, 2013 at 3:47 PM, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 On 2013-09-17 09:19:29 -0700, Lonni J Friedman wrote:
 I'm running a PostgreSQL 9.3.0 cluster (1 master with two streaming
 replication hot standby slaves) on RHEL6-x86_64.  Yesterday I upgraded
 from 9.2.4 to 9.3.0, and since the upgrade I'm seeing a significant
 performance degradation.  PostgreSQL simply feels slower.  Nothing
 other than the version of PostgreSQL changed yesterday.  I used
 pg_upgrade to perform the upgrade, and ran the generated
 analyze_new_cluster.sh immediately afterwards, which completed
 successfully.

 Where did you get 9.3.0 from? Compiled it yourself? Any chance you
 compile with --enable-cassert or somesuch?

Directly from http://yum.postgresql.org.  So unless the RPMs on there
are built weird/wrong, I don't think that's the problem.




Re: [GENERAL] WAL Replication Working but Not Working

2013-08-21 Thread Lonni J Friedman
The first thing to do is look at your server logs around the time when
it stopped working.
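
Beyond the logs, a quick way to confirm whether recovery is actually advancing
is to compare the control data a few minutes apart (a sketch; adjust the
binary and data directory paths for your install):

  pg_controldata /var/lib/pgsql/data | grep -i -e redo -e checkpoint

If the REDO location never moves while new WAL files keep arriving, the
startup process is stuck rather than merely behind.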

On Wed, Aug 21, 2013 at 7:08 AM, Joseph Marlin jmar...@saucontech.com wrote:
 We're having an issue with our warm standby server. About 9:30 last night, it 
 stopped applying changes it received in WAL files that are shipped over to it 
 as they are created. It is still reading WAL files as they delivered, as the 
 startup_log.txt shows, but the changes in the primary database aren't 
 actually being made to the standby, and haven't been since last night.

 Is there any way we can figure out what is going on here? We'd like to 
 recover somehow without having to restore from a base backup, and we'd like 
 to figure out what is happening so we can prevent it in the future.

 Thanks!




Re: [GENERAL] Streaming Replication Randomly Locking Up

2013-08-15 Thread Lonni J Friedman
I've never seen this happen.  Looks like you might be using 9.1?  Are
you up to date on all the 9.1.x releases?

Do you have just 1 slave syncing from the master?
Which OS are you using?
Did you verify that there aren't any network problems between the
slave & master?
Or hardware problems (like the NIC dying, or dropping packets)?
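
It's also worth capturing the replication state while it is wedged (a sketch;
both of these exist in 9.1):

  -- on the master: is the WAL sender connected and advancing?
  SELECT client_addr, state, sent_location, replay_location FROM pg_stat_replication;
  -- on the slave: how stale is replay?
  SELECT now() - pg_last_xact_replay_timestamp() AS replay_lag;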


On Thu, Aug 15, 2013 at 11:07 AM, Andrew Berman rexx...@gmail.com wrote:
 Hello,

 I'm having an issue where streaming replication just randomly stops working.
 I haven't been able to find anything in the logs which point to an issue,
 but the Postgres process shows a waiting status on the slave:

 postgres  5639  0.1 24.3 3428264 2970236 ? Ss   Aug14   1:54 postgres:
 startup process   recovering 0001053D003F waiting
 postgres  5642  0.0 21.4 3428356 2613252 ? Ss   Aug14   0:30 postgres:
 writer process
 postgres  5659  0.0  0.0 177524   788 ?Ss   Aug14   0:03 postgres:
 stats collector process
 postgres  7159  1.2  0.1 3451360 18352 ?   Ss   Aug14  17:31 postgres:
 wal receiver process   streaming 549/216B3730

 The replication works great for days, but randomly seems to lock up and
 replication halts.  I verified that the two databases were out of sync with
 a query on both of them.  Has anyone experienced this issue before?

 Here are some relevant config settings:

 Master:

 wal_level = hot_standby
 checkpoint_segments = 32
 checkpoint_completion_target = 0.9
 archive_mode = on
 archive_command = 'rsync -a %p foo@foo:/var/lib/pgsql/9.1/wals/%f
 /dev/null'
 max_wal_senders = 2
 wal_keep_segments = 32

 Slave:

 wal_level = hot_standby
 checkpoint_segments = 32
 #checkpoint_completion_target = 0.5
 hot_standby = on
 max_standby_archive_delay = -1
 max_standby_streaming_delay = -1
 #wal_receiver_status_interval = 10s
 #hot_standby_feedback = off

 Thank you for any help you can provide!

 Andrew




-- 
~
L. Friedman                     netll...@gmail.com
LlamaLand   https://netllama.linux-sxs.org




Re: [GENERAL] Streaming Replication Randomly Locking Up

2013-08-15 Thread Lonni J Friedman
Are you certain that there are no relevant errors in the database logs
(on both master & slave)?  Also, are you sure that you didn't
misconfigure logging such that errors wouldn't appear?

On Thu, Aug 15, 2013 at 11:45 AM, Andrew Berman rexx...@gmail.com wrote:
 Hi Lonni,

 Yes, I am using PG 9.1.9.
 Yes, 1 slave syncing from the master
 CentOS 6.4
 I don't see any network or hardware issues (e.g. NIC) but will look more
 into this.  They are communicating on a private network and switch.

 I forgot to mention that after I restart the slave, everything syncs right
 back up and all if working again so if it is a network issue, the
 replication is just stopping after some hiccup instead of retrying and
 resuming when things are back up.

 Thanks!



 On Thu, Aug 15, 2013 at 11:32 AM, Lonni J Friedman netll...@gmail.com
 wrote:

 I've never seen this happen.  Looks like you might be using 9.1?  Are
 you up to date on all the 9.1.x releases?

 Do you have just 1 slave syncing from the master?
 Which OS are you using?
 Did you verify that there aren't any network problems between the
 slave & master?
 Or hardware problems (like the NIC dying, or dropping packets)?


 On Thu, Aug 15, 2013 at 11:07 AM, Andrew Berman rexx...@gmail.com wrote:
  Hello,
 
  I'm having an issue where streaming replication just randomly stops
  working.
  I haven't been able to find anything in the logs which point to an
  issue,
  but the Postgres process shows a waiting status on the slave:
 
  postgres  5639  0.1 24.3 3428264 2970236 ? Ss   Aug14   1:54
  postgres:
  startup process   recovering 0001053D003F waiting
  postgres  5642  0.0 21.4 3428356 2613252 ? Ss   Aug14   0:30
  postgres:
  writer process
  postgres  5659  0.0  0.0 177524   788 ?Ss   Aug14   0:03
  postgres:
  stats collector process
  postgres  7159  1.2  0.1 3451360 18352 ?   Ss   Aug14  17:31
  postgres:
  wal receiver process   streaming 549/216B3730
 
  The replication works great for days, but randomly seems to lock up and
  replication halts.  I verified that the two databases were out of sync
  with
  a query on both of them.  Has anyone experienced this issue before?
 
  Here are some relevant config settings:
 
  Master:
 
  wal_level = hot_standby
  checkpoint_segments = 32
  checkpoint_completion_target = 0.9
  archive_mode = on
  archive_command = 'rsync -a %p foo@foo:/var/lib/pgsql/9.1/wals/%f
  /dev/null'
  max_wal_senders = 2
  wal_keep_segments = 32
 
  Slave:
 
  wal_level = hot_standby
  checkpoint_segments = 32
  #checkpoint_completion_target = 0.5
  hot_standby = on
  max_standby_archive_delay = -1
  max_standby_streaming_delay = -1
  #wal_receiver_status_interval = 10s
  #hot_standby_feedback = off
 
  Thank you for any help you can provide!
 
  Andrew
 




Re: [GENERAL] Streaming Replication Randomly Locking Up

2013-08-15 Thread Lonni J Friedman
I'd suggest enhancing your logging to include time/datestamps for
every entry, and also the client hostname.  That will help to rule
in/out those 'unexpected EOF' errors.
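
Something along these lines in postgresql.conf would cover that (a sketch;
%t is a timestamp, %p the backend PID, %l the session line number, %h the
remote host):

  log_line_prefix = '%t [%p]: [%l-1] host=%h '
  log_connections = on
  log_disconnections = on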

On Thu, Aug 15, 2013 at 12:22 PM, Andrew Berman rexx...@gmail.com wrote:
 The only thing I see that is a possibility for the issue is in the slave
 log:

 LOG:  unexpected EOF on client connection
 LOG:  could not receive data from client: Connection reset by peer

 I don't know if that's related or not as it could just be somebody running a
 query.  The log file does seem to be riddled with these but the replication
 failures don't happen constantly.

 As far as I know I'm not swallowing any errors.  The logging is all set as
 the default:

 log_destination = 'stderr'
 logging_collector = on
 #client_min_messages = notice
 #log_min_messages = warning
 #log_min_error_statement = error
 #log_min_duration_statement = -1
 #log_checkpoints = off
 #log_connections = off
 #log_disconnections = off
 #log_error_verbosity = default

 I'm going to have a look at the NICs to make sure there's no issue there.

 Thanks again for your help!


 On Thu, Aug 15, 2013 at 11:51 AM, Lonni J Friedman netll...@gmail.com
 wrote:

 Are you certain that there are no relevant errors in the database logs
 (on both master & slave)?  Also, are you sure that you didn't
 misconfigure logging such that errors wouldn't appear?

 On Thu, Aug 15, 2013 at 11:45 AM, Andrew Berman rexx...@gmail.com wrote:
  Hi Lonni,
 
  Yes, I am using PG 9.1.9.
  Yes, 1 slave syncing from the master
  CentOS 6.4
  I don't see any network or hardware issues (e.g. NIC) but will look more
  into this.  They are communicating on a private network and switch.
 
  I forgot to mention that after I restart the slave, everything syncs
  right
  back up and all if working again so if it is a network issue, the
  replication is just stopping after some hiccup instead of retrying and
  resuming when things are back up.
 
  Thanks!
 
 
 
  On Thu, Aug 15, 2013 at 11:32 AM, Lonni J Friedman netll...@gmail.com
  wrote:
 
  I've never seen this happen.  Looks like you might be using 9.1?  Are
  you up to date on all the 9.1.x releases?
 
  Do you have just 1 slave syncing from the master?
  Which OS are you using?
  Did you verify that there aren't any network problems between the
  slave & master?
  Or hardware problems (like the NIC dying, or dropping packets)?
 
 
  On Thu, Aug 15, 2013 at 11:07 AM, Andrew Berman rexx...@gmail.com
  wrote:
   Hello,
  
   I'm having an issue where streaming replication just randomly stops
   working.
   I haven't been able to find anything in the logs which point to an
   issue,
   but the Postgres process shows a waiting status on the slave:
  
   postgres  5639  0.1 24.3 3428264 2970236 ? Ss   Aug14   1:54
   postgres:
   startup process   recovering 0001053D003F waiting
   postgres  5642  0.0 21.4 3428356 2613252 ? Ss   Aug14   0:30
   postgres:
   writer process
   postgres  5659  0.0  0.0 177524   788 ?Ss   Aug14   0:03
   postgres:
   stats collector process
   postgres  7159  1.2  0.1 3451360 18352 ?   Ss   Aug14  17:31
   postgres:
   wal receiver process   streaming 549/216B3730
  
   The replication works great for days, but randomly seems to lock up
   and
   replication halts.  I verified that the two databases were out of
   sync
   with
   a query on both of them.  Has anyone experienced this issue before?
  
   Here are some relevant config settings:
  
   Master:
  
   wal_level = hot_standby
   checkpoint_segments = 32
   checkpoint_completion_target = 0.9
   archive_mode = on
   archive_command = 'rsync -a %p foo@foo:/var/lib/pgsql/9.1/wals/%f
   /dev/null'
   max_wal_senders = 2
   wal_keep_segments = 32
  
   Slave:
  
   wal_level = hot_standby
   checkpoint_segments = 32
   #checkpoint_completion_target = 0.5
   hot_standby = on
   max_standby_archive_delay = -1
   max_standby_streaming_delay = -1
   #wal_receiver_status_interval = 10s
   #hot_standby_feedback = off
  
   Thank you for any help you can provide!
  
   Andrew
  





-- 
~
L. Friedman                     netll...@gmail.com
LlamaLand   https://netllama.linux-sxs.org




[GENERAL] postgres FDW cost estimation options unrecognized in 9.3-beta1

2013-07-26 Thread Lonni J Friedman
Greetings,
I have a postgresql-9.3-beta1 cluster setup (from the
yum.postgresql.org RPMs), where I'm experimenting with the postgres
FDW extension.  The documentation (
http://www.postgresql.org/docs/9.3/static/postgres-fdw.html )
references three Cost Estimation Options which can be set for a
foreign table or a foreign server.  However when I attempt to set
them, I always get an error that the option is not found:
###
nightly=# show SERVER_VERSION ;
 server_version

 9.3beta1

nightly=# \des+
                                                  List of foreign servers
   Name    |   Owner   | Foreign-data wrapper | Access privileges | Type | Version |                    FDW Options                     | Description
-----------+-----------+----------------------+-------------------+------+---------+----------------------------------------------------+-------------
 cuda_db10 | lfriedman | postgres_fdw         |                   |      |         | (host 'cuda-db10', dbname 'nightly', port '5432')  |
(1 row)

nightly=# ALTER SERVER cuda_db10 OPTIONS (SET use_remote_estimate 'true') ;
ERROR:  option use_remote_estimate not found
###

Am I doing something wrong, or is this a bug?

thanks




Re: [GENERAL] postgres FDW cost estimation options unrecognized in 9.3-beta1

2013-07-26 Thread Lonni J Friedman
On Fri, Jul 26, 2013 at 3:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Lonni J Friedman netll...@gmail.com writes:
 nightly=# ALTER SERVER cuda_db10 OPTIONS (SET use_remote_estimate 'true') ;
 ERROR:  option use_remote_estimate not found

 Am I doing something wrong, or is this a bug?

 [ experiments... ]  You need to say ADD, not SET, to add a new option to
 the list.  SET might more appropriately be spelled REPLACE, because it
 requires that the object already have a defined value for the option,
 which will be replaced.

 Our documentation appears not to disclose this fine point, but a look
 at the SQL-MED standard says it's operating per spec.  The standard also
 says that ADD is an error if the option is already defined, which is a
 bit more defensible, but still not exactly what I'd call user-friendly.
 And the error we issue for that case is pretty misleading too:

 regression=# ALTER SERVER cuda_db10 OPTIONS (use_remote_estimate 'true') ;
 ALTER SERVER
 regression=# ALTER SERVER cuda_db10 OPTIONS (use_remote_estimate 'false') ;
 ERROR:  option use_remote_estimate provided more than once

 I think we could do with both more documentation, and better error
 messages for these cases.  In the SET-where-you-should-use-ADD case,
 perhaps

 ERROR:  option use_remote_estimate has not been set
 HINT: Use ADD not SET to define an option that wasn't already set.

 In the ADD-where-you-should-use-SET case, perhaps

 ERROR:  option use_remote_estimate is already set
 HINT: Use SET not ADD to change an option's value.

 The provided more than once wording would be appropriate if the same
 option is specified more than once in the command text, but I'm not sure
 that it's worth the trouble to detect that case.

 Thoughts, better wordings?

Thanks Tom, I've confirmed that using ADD was the solution.  I think
your suggested updated ERROR & HINT text is an excellent improvement.
It definitely would have given me the clue I was missing earlier.
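
For the archives, the full set of working forms ends up looking like this
(a sketch against the server defined earlier):

  -- first time the option is defined:
  ALTER SERVER cuda_db10 OPTIONS (ADD use_remote_estimate 'true');
  -- changing it later:
  ALTER SERVER cuda_db10 OPTIONS (SET use_remote_estimate 'false');
  -- removing it entirely:
  ALTER SERVER cuda_db10 OPTIONS (DROP use_remote_estimate);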




[GENERAL] upgrading from 9.3-beta1 to 9.3-beta2 requires dump reload?

2013-07-24 Thread Lonni J Friedman
Greetings,
I just got around to upgrading from 9.3-beta1 to 9.3-beta2, and was
surprised to see that the server was refusing to start. In the log,
I'm seeing:

2013-07-24 13:41:47 PDT [7083]: [1-1] db=,user= FATAL:  database files
are incompatible with server
2013-07-24 13:41:47 PDT [7083]: [2-1] db=,user= DETAIL:  The database
cluster was initialized with CATALOG_VERSION_NO 201305061, but the
server was compiled with CATALOG_VERSION_NO 201306121.
2013-07-24 13:41:47 PDT [7083]: [3-1] db=,user= HINT:  It looks like
you need to initdb.


I'm using the RPMs from yum.postgresql.org on RHEL6.  Is this
expected, intentional behavior?  Do I really need to dump & reload to
upgrade between beta releases of 9.3, or is there some more efficient
way?

thanks




Re: [GENERAL] upgrading from 9.3-beta1 to 9.3-beta2 requires dump reload?

2013-07-24 Thread Lonni J Friedman
On Wed, Jul 24, 2013 at 2:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Lonni J Friedman escribió:
 I'm using the RPMs from yum.postgresql.org on RHEL6.  Is this
 expected, intentional behavior?  Do I really need to dump  reload to
 upgrade between beta releases of 9.3, or is there some more efficient
 way?

 We try to avoid forcing initdb between beta versions, but it's not
 guaranteed.  You should be able to use pg_upgrade, also.

 Unfortunately, the RPMs probably won't be very helpful for using
 pg_upgrade, since there's no convenient way to get beta1 and beta2
 postmaster executables installed at the same time (unless Devrim
 foresaw this case and packaged things differently than I did for
 Red Hat ;-)).

Sounds like I'm out of luck.  Thanks anyway.
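
For anyone else in the same spot, the fallback is the plain dump & reload (a
sketch; it assumes the dump is taken *before* the beta1 packages are replaced,
and the data directory path follows the PGDG RPM layout):

  pg_dumpall > /tmp/all.sql                  # while still on beta1
  # swap the RPMs, move the old data directory aside, then:
  /usr/pgsql-9.3/bin/initdb -D /var/lib/pgsql/9.3/data
  /usr/pgsql-9.3/bin/pg_ctl -D /var/lib/pgsql/9.3/data start
  psql -d postgres -f /tmp/all.sql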




Re: [GENERAL] Standby stopped working after PANIC: WAL contains references to invalid pages

2013-06-22 Thread Lonni J Friedman
Looks like some kind of data corruption.  Question is whether it came
from the master, or was created by the standby.  If you re-seed the
standby with a full (base) backup, does the problem go away?
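
Re-seeding the standby can be as simple as this, run on the standby against
the master (a sketch for 9.2; stop the standby and clear its data directory
first, and the host/user names here are placeholders):

  pg_basebackup -h master-host -U replication_user -D /usr/local/pgsql/data -X stream -P

Afterwards put your recovery.conf back in place and start the standby again.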

On Sat, Jun 22, 2013 at 12:43 PM, Dan Kogan d...@iqtell.com wrote:
 Hello,



 Today our standby instance stopped working with this error in the log:



 2013-06-22 16:27:32 UTC [8367]: [247-1] [] WARNING:  page 158130 of relation
 pg_tblspc/16447/PG_9.2_201204301/16448/39154429 is uninitialized

 2013-06-22 16:27:32 UTC [8367]: [248-1] [] CONTEXT:  xlog redo vacuum: rel
 16447/16448/39154429; blk 158134, lastBlockVacuumed 158129

 2013-06-22 16:27:32 UTC [8367]: [249-1] [] PANIC:  WAL contains references
 to invalid pages

 2013-06-22 16:27:32 UTC [8367]: [250-1] [] CONTEXT:  xlog redo vacuum: rel
 16447/16448/39154429; blk 158134, lastBlockVacuumed 158129

 2013-06-22 16:27:32 UTC [8366]: [3-1] [] LOG:  startup process (PID 8367)
 was terminated by signal 6: Aborted

 2013-06-22 16:27:32 UTC [8366]: [4-1] [] LOG:  terminating any other active
 server processes



 After re-start the same exact error occurred.



 We thought that maybe we hit this bug -
 http://postgresql.1045698.n5.nabble.com/Completely-broken-replica-after-PANIC-WAL-contains-references-to-invalid-pages-td5750072.html.

 However, there is nothing in our log about sub-transactions, so it didn't
 seem the same to us.



 Any advice on how to further debug this so we can avoid this in the future
 is appreciated.



 Environment:



 AWS, High I/O instance (hi1.4xlarge), 60GB RAM



 Software and settings:



 PostgreSQL 9.2.4 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro
 4.5.2-8ubuntu4) 4.5.2, 64-bit



 archive_command  rsync -a %p
 slave:/var/lib/postgresql/replication_load/%f

 archive_mode   on

 autovacuum_freeze_max_age 10

 autovacuum_max_workers6

 checkpoint_completion_target 0.9

 checkpoint_segments   128

 checkpoint_timeout   30min

 default_text_search_config   pg_catalog.english

 hot_standby  on

 lc_messages  en_US.UTF-8

 lc_monetary  en_US.UTF-8

 lc_numeric  en_US.UTF-8

 lc_time en_US.UTF-8

 listen_addresses  *

 log_checkpoints   on

 log_destinationstderr

 log_line_prefix %t [%p]: [%l-1] [%h]

 log_min_duration_statement-1

 log_min_error_statement   error

 log_min_messages error

 log_timezoneUTC

 maintenance_work_mem   1GB

 max_connections1200

 max_standby_streaming_delay90s

 max_wal_senders   5

 port   5432

 random_page_cost2

 seq_page_cost 1

 shared_buffers4GB

 ssl   off

 ssl_cert_file   /etc/ssl/certs/ssl-cert-snakeoil.pem

 ssl_key_file/etc/ssl/private/ssl-cert-snakeoil.key

 synchronous_commitoff

 TimeZoneUTC

 wal_keep_segments 128

 wal_level hot_standby

 work_mem8MB



 root@ip-10-148-131-236:~# /usr/local/pgsql/bin/pg_controldata
 /usr/local/pgsql/data

 pg_control version number:922

 Catalog version number:   201204301

 Database system identifier:   5838668587531239413

 Database cluster state:   in archive recovery

 pg_control last modified: Sat 22 Jun 2013 06:13:07 PM UTC

 Latest checkpoint location:   2250/18CA0790

 Prior checkpoint location:2250/18CA0790

 Latest checkpoint's REDO location:224F/E127B078

 Latest checkpoint's TimeLineID:   2

 Latest checkpoint's full_page_writes: on

 Latest checkpoint's NextXID:  1/2018629527

 Latest checkpoint's NextOID:  43086248

 Latest checkpoint's NextMultiXactId:  7088726

 Latest checkpoint's NextMultiOffset:  20617234

 Latest checkpoint's oldestXID:1690316999

 Latest checkpoint's oldestXID's DB:   16448

 Latest checkpoint's oldestActiveXID:  2018629527

 Time of latest checkpoint:Sat 22 Jun 2013 03:24:05 PM UTC

 Minimum recovery ending location: 2251/5EA631F0

 Backup start location:0/0

 Backup end location:  0/0

 End-of-backup record required:no

 Current wal_level setting:hot_standby

 Current max_connections setting:  1200

 Current max_prepared_xacts setting:   0

 Current max_locks_per_xact setting:   64

 Maximum data alignment:   8

 Database block size:  8192

 Blocks per segment of large relation: 131072

 WAL block size:   8192

 Bytes per WAL segment:16777216

 Maximum length of identifiers:64

 Maximum columns in an index:  32

 Maximum size of a TOAST chunk:1996

 Date/time type storage:   64-bit integers

 Float4 argument passing:  by value

 Float8 argument passing:  by value

 root@ip-10-148-131-236:~#



 Thanks again.



 Dan



-- 

Re: [GENERAL] Standby stopped working after PANIC: WAL contains references to invalid pages

2013-06-22 Thread Lonni J Friedman
Assuming that you still have $PGDATA from the broken instance (such
that you can reproduce the crash again), there might be a way to debug
it further.  I'd guess that something like bad RAM or storage could
cause an index to get corrupted in this fashion, but the fact that
you're using AWS makes that less likely.  Someone far more
knowledgeable than I will need to provide guidance on how to debug
this though.
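
As a first step, mapping the file in the error back to a relation is easy (a
sketch; run it in the database whose OID is 16448, per the path in the
message):

  SELECT relname, relkind FROM pg_class WHERE relfilenode = 39154429;

A relkind of 'i' would confirm it really is an index, 'r' a plain table.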

On Sat, Jun 22, 2013 at 4:17 PM, Dan Kogan d...@iqtell.com wrote:
 Re-seeding the standby with a full base backup does seem to make the error go 
 away.
 The standby started, caught up and has been working for about 2 hours.

 The file in the error message was an index.  We rebuilt it just in case.
 Is there any way to debug the issue at this point?



 -Original Message-
 From: Lonni J Friedman [mailto:netll...@gmail.com]
 Sent: Saturday, June 22, 2013 4:11 PM
 To: Dan Kogan
 Cc: pgsql-general@postgresql.org
 Subject: Re: [GENERAL] Standby stopped working after PANIC: WAL contains 
 references to invalid pages

 Looks like some kind of data corruption.  Question is whether it came from 
 the master, or was created by the standby.  If you re-seed the standby with a 
 full (base) backup, does the problem go away?

 On Sat, Jun 22, 2013 at 12:43 PM, Dan Kogan d...@iqtell.com wrote:
 Hello,



 Today our standby instance stopped working with this error in the log:



 2013-06-22 16:27:32 UTC [8367]: [247-1] [] WARNING:  page 158130 of
 relation
 pg_tblspc/16447/PG_9.2_201204301/16448/39154429 is uninitialized

 2013-06-22 16:27:32 UTC [8367]: [248-1] [] CONTEXT:  xlog redo vacuum:
 rel 16447/16448/39154429; blk 158134, lastBlockVacuumed 158129

 2013-06-22 16:27:32 UTC [8367]: [249-1] [] PANIC:  WAL contains
 references to invalid pages

 2013-06-22 16:27:32 UTC [8367]: [250-1] [] CONTEXT:  xlog redo vacuum:
 rel 16447/16448/39154429; blk 158134, lastBlockVacuumed 158129

 2013-06-22 16:27:32 UTC [8366]: [3-1] [] LOG:  startup process (PID
 8367) was terminated by signal 6: Aborted

 2013-06-22 16:27:32 UTC [8366]: [4-1] [] LOG:  terminating any other
 active server processes



 After re-start the same exact error occurred.



 We thought that maybe we hit this bug -
 http://postgresql.1045698.n5.nabble.com/Completely-broken-replica-after-PANIC-WAL-contains-references-to-invalid-pages-td5750072.html.

 However, there is nothing in our log about sub-transactions, so it
 didn't seem the same to us.



 Any advice on how to further debug this so we can avoid this in the
 future is appreciated.



 Environment:



 AWS, High I/O instance (hi1.4xlarge), 60GB RAM



 Software and settings:



 PostgreSQL 9.2.4 on x86_64-unknown-linux-gnu, compiled by gcc
 (Ubuntu/Linaro
 4.5.2-8ubuntu4) 4.5.2, 64-bit



 archive_command  rsync -a %p
 slave:/var/lib/postgresql/replication_load/%f

 archive_mode   on

 autovacuum_freeze_max_age 10

 autovacuum_max_workers6

 checkpoint_completion_target 0.9

 checkpoint_segments   128

 checkpoint_timeout   30min

 default_text_search_config   pg_catalog.english

 hot_standby  on

 lc_messages  en_US.UTF-8




