Re: [GENERAL] Server/Client Encoding Errors

2009-07-27 Thread Albe Laurenz
APseudoUtopia wrote:
 I'm having some problems when inserting special characters into a
 column. Here's the table:
 
 --
  Table "public.users_history_ip"
    Column   |            Type             |                       Modifiers
  -----------+-----------------------------+----------------------------------------------------------------
  id         | bigint                      | not null default nextval('users_history_ip_id_seq'::regclass)
  userid     | integer                     | not null
  ip         | inet                        | not null
  hostname   | character varying(512)      | not null
  geoip_info | character varying(512)      | not null
  start_time | timestamp without time zone | not null
  last_seen  | timestamp without time zone | not null
  type       | ip_history_type             | not null
  Indexes:
      "users_history_ip_pkey" PRIMARY KEY, btree (id)
  Foreign-key constraints:
      "users_history_ip_userid_fkey" FOREIGN KEY (userid) REFERENCES
  users_main(id) ON DELETE CASCADE
 --
 
 I'm trying to insert information into the geoip_info column. Here's
 some of the information that I'm trying to insert, and the errors:
 
 'Portugal, 09, Vila Real De Santo António'
 ERROR:  invalid byte sequence for encoding UTF8: 0xf36e696f
 
 'Norway, 08, Ålesund'
 ERROR:  invalid byte sequence for encoding UTF8: 0xc56c
 
 'Portugal, 04, Vila Nova De Famalicão'
 ERROR:  invalid byte sequence for encoding UTF8: 0xe36f2c
 
 The locale on the server is C and the encoding is UTF8. I thought
 the UTF8 encoding would allow characters like this? Why is it
 disallowing it?
 Note, the GeoIP info is generated automatically by a module, so I am
 unable to determine exactly what characters will be returned.

The UTF8 encoding allows you to store the characters ó, Å and ã, but
you have to encode them correctly.

Judging from the error messages, you have set your client_encoding to UTF8,
but are feeding it data encoded in LATIN1 or WIN1252.

If your client sends LATIN1 data, set client_encoding to LATIN1 so
that PostgreSQL can correctly convert the characters to UTF8.
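
For example, run this in the session before sending the LATIN1 data:

SET client_encoding TO 'LATIN1';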

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] synchronous_commit=off doesn't always return immediately

2009-07-27 Thread tomrevam



Alvaro Herrera-7 wrote:
 
 This is why Tom was suggesting you to increase wal_buffers.  Did
 you try that?
 

Thanks for the explanation. I will try increasing the wal_buffers.
Unfortunately this is on a system I can't restart for the next few days.
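
For reference, the current value can be checked without a restart; raising
wal_buffers itself is done in postgresql.conf and only takes effect after a
restart (the figure below is just an assumed example):

SHOW wal_buffers;   -- 64kB is the default on 8.x
-- in postgresql.conf:  wal_buffers = 1MB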

Tomer
-- 
View this message in context: 
http://www.nabble.com/synchronous_commit%3Doff-doesn%27t-always-return-immediately-tp24621119p24675696.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Very slow joins

2009-07-27 Thread MS
 postgres collect all necessary stats. Maybe an implicit analyze is
 necessary?

Should be: explicit analyze.

   BUT I found the real cause of my problem - the fk2 field from my
   example had not only an index, but it was also a foreign key to
   another table.
  That seems unlikely to be the cause.

It's just what I saw. First I tried with all the FKs in place, but I had
to cancel the queries because they took too long.
Then I dropped the FK and the query ran in a couple of minutes.


Thanks,
MS


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Very slow joins

2009-07-27 Thread MS
 What first post? The only thing I can find is a reference in a message  
 by you from yesterday, to a two-year old post that you claim is about  
 the same problem. Though it's possible that it is the same problem,  
 you don't provide any data to back that up.

Strange - you can see the full thread here:
http://groups.google.pl/group/pgsql.general/browse_thread/thread/6c4ea8356327276c
I post via Google Groups; maybe that's the cause?

 The message you referred to was about a one-of-a-kind problem with  
 communications to the client and had nothing to do with performance on  
 the server; is that indeed what you're seeing? In that case you should  
 check your network infrastructure for problems.

No, I actually meant that the CPU usage was low during my query.
On the other hand, IO wait was very high, so the low CPU usage was kind
of normal.

 Usually server performance problems are due to problems with tuning  
 parameters or outdated statistics. Those issues can usually be solved  
 easily.

Well, maybe - I started using Postgres only recently, so maybe I'm making
some obvious mistakes. My database was filled incrementally by a shell
script - I don't know if that alone lets Postgres collect all necessary
stats. Maybe an implicit analyze is necessary?

I also tried to vacuum / vacuum full my tables before running my queries,
but it took too long and I had to cancel it.


 Without posting an EXPLAIN ANALYSE people here can only guess what  
 your problem is.

I'm posting another explain below.
I ran my query with explain analyze but forgot to save the output :( I
don't think I'll be able to run those queries again, because they took
around 30-60 minutes and almost killed my server (it was almost
completely unresponsive during the query because of 90% IO wait).

  BUT I found the real cause of my problem - the fk2 field from my
  example had not only an index, but it was also a foreign key to
  another table.
 That seems unlikely to be the cause. From the above it seems much more  
 likely that you're suffering from a bad query plan instead, but you  
 don't provide any details.

I just tried to create a test case with similar data - here is what it
looks like (it should work if you just paste it into some test db):

-- - generate test tables + data

drop table if exists article, keyword, article_keyword, tochange, sums
cascade;

CREATE TABLE article (
id serial NOT NULL PRIMARY KEY,
content varchar(255) NULL,
ip inet NULL,
has_comments bool not null
)
;


CREATE TABLE keyword (
id serial NOT NULL PRIMARY KEY,
keyword varchar(40) NOT NULL UNIQUE,
articles integer NOT NULL
)
;
CREATE TABLE article_keyword (
id serial NOT NULL PRIMARY KEY,
article_id integer NOT NULL REFERENCES article (id)
DEFERRABLE INITIALLY DEFERRED,
keyword_id integer NOT NULL REFERENCES keyword (id)
DEFERRABLE INITIALLY DEFERRED,
votes_yes integer NOT NULL,
votes_no integer NOT NULL
)
;
CREATE INDEX article_keyword_keyword_id ON
article_keyword (keyword_id);



insert into article(content, ip, has_comments) values ('some article',
'123.121.121.223', true);


insert into keyword
select nextval('keyword_id_seq'), md5(to_char(i, '999999')), 0
from generate_series(1,200) as i;

insert into article_keyword
select nextval('article_keyword_id_seq'), 1, k.id, 0, 0 from
generate_series(1,200) as i
join keyword k on k.keyword=md5(to_char(i, '999999'))
join generate_series(1,5) as times on true
;

create table tochange (
fromid int not null primary key,
toid int not null
);

insert into tochange
select k1.id, k2.id from
generate_series(1,20) as i
join keyword k1 on k1.keyword=md5(to_char(i, '999999'))
join keyword k2 on k2.keyword=md5(to_char(i+20, '999999'))
;


create table sums (
id int not null primary key,
sum int
);


-- - now my queries:


-- replace fromid's with toid's

update article_keyword
set keyword_id=tc.toid
from tochange tc
where
keyword_id=tc.fromid
;


-- delete unused keywords
delete from article_keyword
where id in (
select k.id
from keyword k
left join article_keyword ak on k.id=ak.keyword_id
where ak.keyword_id is null
)
;


-- recalculate sums - in how many articles is a keyword used?
insert into sums
select keyword_id, count(*)
from article_keyword
group by keyword_id;

update keyword k
set articles=s.sum
from
sums s
where
k.id=s.id;

--

The problem is that I can't reproduce this slow behaviour with this
test case. :(
The tables are almost identical - only the article table is bigger in
reality (it has around a million rows).

When I run explain update (the first update from the test case), it
now prints this:

 Merge Join  (cost=5.14..53436.13 rows=3636710 width=26)
   Merge Cond: (tc.fromid = article_keyword.keyword_id)
   ->  Index Scan using tochange_pkey on tochange tc
         (cost=0.00..2830.26 rows=10 width=8)
   ->  Index Scan using article_keyword_keyword_id on article_keyword

[GENERAL] Postgres and multiple updates in one statement

2009-07-27 Thread Jamie Lawrence-Jenner
Hi there,

We have a function which runs a set of update clauses, and we are considering
putting all the update clauses into one statement.

I would like to understand how Postgres handles multiple updates. If we were
to send 5 update statements in one SQL statement to the db, would it:

Do 5 passes on the table, retrieving the id and updating the row on each pass, or

Do 1 pass to retrieve the 5 primary keys, then update all rows in parallel?

Apart from saving on the overhead of having to open up 5 separate
connections, what are the benefits to passing in multiple updates in one
statement?

Many thanks,

Jamie



Re: [GENERAL] Postgres and multiple updates in one statement

2009-07-27 Thread Raymond O'Donnell
On 27/07/2009 09:10, Jamie Lawrence-Jenner wrote:

 Apart from saving on the overhead of having to open up 5 separate
 connections, what are the benefits to passing in multiple updates in one
 statement?

If you do them all within one transaction -

  begin;
  update
  update...
  ...
  commit;

- then you save on the overhead associated with beginning and committing
a transaction for each update.

Ray.

--
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
r...@iol.ie
Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals
--

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres and multiple updates in one statement

2009-07-27 Thread Scott Marlowe
On Mon, Jul 27, 2009 at 2:10 AM, Jamie
Lawrence-Jennerjamie.jen...@autovhc.co.uk wrote:
 Hi there

 We have a function which runs a set of update clauses and we are considering
 putting all the update clauses into one statement.

 I would like to understand how postgres handles multiple updates. If we were
 to send 5 update statements in one sql statement to the db would it:

 Do 5 passes on the table, on each pass, retrieve the id then update the row

 Do 1 pass to retrieve the 5 primary keys, then update all rows in parallel

It would do 5 passes.  Better to have one update statement to reduce bloat.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres and multiple updates in one statement

2009-07-27 Thread nha
Hello,

Le 27/07/09 15:07, Raymond O'Donnell a écrit :
 On 27/07/2009 09:10, Jamie Lawrence-Jenner wrote:
 
 Apart from saving on the overhead of having to open up 5 separate
 connections, what are the benefits to passing in multiple updates in one
 statement?
 
 If you do them all within one transaction -
 
   begin;
   update
   update...
   ...
   commit;
 
 - then you save on the overhead associated with beginning and committing
 a transaction for each update.
 

In addition to the single-transaction approach suggested by Raymond
O'Donnell, I would add that performance will depend on the FROM and WHERE
clauses of the original UPDATE statements.

If the multiple UPDATE statements follow a similar pattern (i.e. similar
FROM (optional) and WHERE clauses), it may be clearer (in the source code)
and faster (for the database engine) to merge them into a single UPDATE
statement. Otherwise (i.e. UPDATE statements with little in common in their
FROM and/or WHERE clauses), a single transaction block as suggested by
Raymond O'Donnell would certainly be more appropriate.

Suggestions for merging the UPDATE statements can be offered if some of the
original statements are shared--without compromising confidential data.

Regards.

--
nha / Lyon / France.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres and multiple updates in one statement

2009-07-27 Thread Joshua Tolley
On Mon, Jul 27, 2009 at 07:31:37AM -0600, Scott Marlowe wrote:
 On Mon, Jul 27, 2009 at 2:10 AM, Jamie
 Lawrence-Jennerjamie.jen...@autovhc.co.uk wrote:
  Do 1 pass to retrieve the 5 primary keys, then update all rows in parallel
 
 I would do 5 passes.  Better to have one update statement to reduce bloat.

You could possibly use UNION or UNION ALL to consolidate your 5 passes into
one pass. You could also possibly use UPDATE FROM to avoid having to return
the primary keys at all, and get the whole thing done in one query.
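
A sketch of the UPDATE ... FROM idea using a VALUES list (table and column
names here are hypothetical, and VALUES in FROM needs 8.2 or later):

UPDATE mytable t
SET    col1 = v.col1,
       col2 = v.col2
FROM   (VALUES (1, 'x1', 'y1'),
               (2, 'x2', 'y2')) AS v(pkid, col1, col2)
WHERE  t.pkid = v.pkid;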

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com




[GENERAL] Calculating the difference between timetz values

2009-07-27 Thread Alexey Klyukin

Hello,

I was looking for a way to get the difference (interval) between 2  
timetz values, i.e.:


postgres=# select '2:45+7'::timetz - '2:44+2'::timetz;
ERROR:  operator does not exist: time with time zone - time with time zone

LINE 1: select '2:45+7'::timetz - '2:44+2'::timetz;

I'd expect the result of the above to be an interval of 05:01:00. Is
there any function or operator that calculates the difference correctly?
I've found a way to cast a timetz to time without the TZ and subtract
the resulting time values, but that's not a correct solution for the
problem above, due to the loss of all TZ information.


Regards,
--
Alexey Klyukin   http://www.CommandPrompt.com
The PostgreSQL Company - Command Prompt, Inc.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres and multiple updates in one statement

2009-07-27 Thread Jamie Lawrence-Jenner
Hi There

Our update statements are as follows:

Update table set col1=x,col2=y where pkid=1;
Update table set col1=x,col2=y where pkid=2;
Update table set col1=x,col2=y where pkid=3;

Very simple and straightforward. Sometimes there could be as many as 50
update statements to process.

Many thanks

Jamie


-Original Message-
From: pgsql-general-ow...@postgresql.org
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of nha
Sent: 27 July 2009 14:35
To: Jamie Lawrence-Jenner
Cc: PgSQL-general
Subject: Re: [GENERAL] Postgres and multiple updates in one statement

Hello,

Le 27/07/09 15:07, Raymond O'Donnell a écrit :
 On 27/07/2009 09:10, Jamie Lawrence-Jenner wrote:
 
 Apart from saving on the overhead of having to open up 5 separate
 connections, what are the benefits to passing in multiple updates in one
 statement?
 
 If you do them all within one transaction -
 
   begin;
   update
   update...
   ...
   commit;
 
 - then you save on the overhead associated with beginning and committing
 a transaction for each update.
 

Next to the transaction way suggested by Raymond O'Donnell, I would add
that performance would depend on FROM and WHERE clauses specified in the
original UPDATE statements.

In the case of multiple UPDATE statements following a quite similar
schema (ie. similar FROM (optional) and WHERE clauses), it might be
clearer (for the source code) and faster (for the database engine) to
merge them in a single UPDATE statement. Otherwise (ie. UPDATE
statements with not so much FROM and/or WHERE clauses in common),
transaction 1-block statement as suggested by Raymond O'Donnell would
certainly be the more appropriate.

Improvements on the merging UPDATE statements may thence be advised if
some pieces of original statements could be given--without compromising
confidential data.

Regards.

--
nha / Lyon / France.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres and multiple updates in one statement

2009-07-27 Thread nha
Hello again,

Le 27/07/09 16:48, Jamie Lawrence-Jenner a écrit :
 Hi There
 
 Our update statements are  as follows
 
 Update table set col1=x,col2=y where pkid=1;
 Update table set col1=x,col2=y where pkid=2;
 Update table set col1=x,col2=y where pkid=3;
 
 Very simple and straight forward. Sometimes there could be as many as 50
 update statements to process.
 
 Many thanks
 
 Jamie
 [...]
 Improvements on the merging UPDATE statements may thence be advised if
 some pieces of original statements could be given--without compromising
 confidential data.
 

Thanks for the examples.

Comparing explain analyze outputs from the two approaches (multiple
statements vs. one statement) should help in choosing the faster one.

For the one-statement approach, the rewritten query could be:

UPDATE yTable SET col1=x, col2=y WHERE pkID IN (1, 2, 3);

Lists of pkIDs are best expressed as an enumeration in the WHERE clause.
In a more general situation, I would recommend determining the list of
pkIDs before building the UPDATE statement(s) if possible. This should
save runtime.

With regards.
--
nha / Lyon / France.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres and multiple updates in one statement

2009-07-27 Thread Scott Marlowe
On Mon, Jul 27, 2009 at 8:48 AM, Jamie
Lawrence-Jennerjamie.jen...@autovhc.co.uk wrote:
 Hi There

 Our update statements are  as follows

 Update table set col1=x,col2=y where pkid=1;
 Update table set col1=x,col2=y where pkid=2;
 Update table set col1=x,col2=y where pkid=3;

 Very simple and straight forward. Sometimes there could be as many as 50
 update statements to process.

Might as well fire them each as separate statements inside one
transaction, since pkid is unique and non-repeated in the updates.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres and multiple updates in one statement

2009-07-27 Thread Tom Lane
Scott Marlowe scott.marl...@gmail.com writes:
 On Mon, Jul 27, 2009 at 8:48 AM, Jamie
 Lawrence-Jennerjamie.jen...@autovhc.co.uk wrote:
 Our update statements are  as follows
 
 Update table set col1=x,col2=y where pkid=1;
 Update table set col1=x,col2=y where pkid=2;
 Update table set col1=x,col2=y where pkid=3;
 
 Very simple and straight forward. Sometimes there could be as many as 50
 update statements to process.

 Might as well fire them each as separate statements inside one
 transaction, since pkid is unique and non-repeated in the updates.

If they're all exactly the same pattern like that, it might be worth the
trouble to set up a prepared statement.
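
A minimal sketch of that, using hypothetical table and column names:

PREPARE upd(int, text, text) AS
    UPDATE mytable SET col1 = $2, col2 = $3 WHERE pkid = $1;

EXECUTE upd(1, 'x', 'y');
EXECUTE upd(2, 'x', 'y');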

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Running vacuumdb -a taking too long

2009-07-27 Thread Keaton Adams
We have reached a point in the size and utilization of a set of our Postgres 
8.1.17 databases where a vacuumdb -a -z -U postgres is still running after 50 
hours and we have to kill it off, because the additional I/O is causing queries 
to stack up.  We have archived off as much data as possible (100+ GB) and we are 
looking at moving to a better hardware configuration to support the increased 
system activity, but for now I need to know if there is a way to avoid 
transaction ID wraparound if we don't issue a vacuumdb -a to update the 
datfrozenxid value.

This is a logging type database where data is initially loaded and then rolled 
up into weekly/monthly tables.  Many of the tables in the DB don't last for 
more than seven days, and after a week the majority of the historical tables 
are static and stay around until they are finally dropped off (based on a 
retention period), so re-running a vacuum on these older static tables doesn't 
really gain much since the data never changes.

I have read (and re-read) the documentation and am a bit confused about exactly 
what needs to happen here:

Since periodic VACUUM runs are needed anyway for the reasons described earlier, 
it's unlikely that any table would not be vacuumed for as long as a billion 
transactions. But to help administrators ensure this constraint is met, VACUUM 
stores transaction ID statistics in the system table pg_database. In 
particular, the datfrozenxid column of a database's pg_database row is updated 
at the completion of any database-wide VACUUM operation (i.e., VACUUM that does 
not name a specific table). The value stored in this field is the freeze cutoff 
XID that was used by that VACUUM command. All normal XIDs older than this 
cutoff XID are guaranteed to have been replaced by FrozenXID within that 
database. A convenient way to examine this information is to execute the query

SELECT datname, age(datfrozenxid) FROM pg_database;

The age column measures the number of transactions from the cutoff XID to the 
current transaction's XID.

So if, after a table is no longer added to (becomes static), I run a VACUUM 
against it, the table wouldn't need to be vacuumed again, since the tuples and 
their related transaction IDs never change?  Is there a way to set up a vacuum 
scheme on a table-by-table basis, to accomplish the same goal as an all-in-one 
vacuumdb -a run, that I can spread out over time instead of relying on the 
completion of a single vacuumdb -a command?



Re: [GENERAL] Calculating the difference between timetz values

2009-07-27 Thread Michael Glaesemann


On Jul 27, 2009, at 10:54 , Alexey Klyukin wrote:


Hello,

I was looking for a way to get the difference (interval) between 2  
timetz values, i.e.:


I don't have a solution, but am curious what your use case is for  
timetz (as opposed to timestamptz).


Michael Glaesemann
grzm seespotcode net




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Run procedure at startup

2009-07-27 Thread Saleem EDAH-TALLY
Hello,

Is there a way to run a pl/pgsql function automatically at server startup?

Is there a way to run a pl/pgsql function with an infinite loop as a daemon?

Is there a way to start a pl/pgsql function that would persist after the user 
session has closed?

Is there a way for an unprivileged user to delegate a task (allowed by the 
superuser) to a superuser?

Thanks in advance.



Re: [GENERAL] Run procedure at startup

2009-07-27 Thread John R Pierce

Saleem EDAH-TALLY wrote:

Is there a way to run a pl/pgsql automatically at server startup ?


in your postgres startup script, launch a session with `psql ... -c 'some 
sql commands'` or `psql ... -f somescript.sql` ...



Is there a way to run a pl/pgsql function with an infinite loop as a 
daemon ?


functions are called from within a transaction.   if you did this, that 
transaction would never end, and this would prevent VACUUM from cleaning 
up any freed tuples newer than the start of that transaction.   Not 
good.   you could, however, have a system daemon that periodically 
invokes a plpgsql function.



Is there a way to start a pl/pgsql function that would persist after 
the user session has closed ?


no.


Is there a way for an unprivileged user to delegate a task (allowed by 
superuser) to a superuser ?


someone else will have to chime in here.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Run procedure at startup

2009-07-27 Thread Pavel Stehule
Hello

2009/7/27 Saleem EDAH-TALLY nm...@netcourrier.com:
 Hello,

 Is there a way to run a pl/pgsql automatically at server startup ?

no - you can only do it from the startup scripts


 Is there a way to run a pl/pgsql function with an infinite loop as a daemon
 ?

an infinite loop is possible, but you will probably need orafce or
something similar to make an infinite loop usable:

http://www.postgres.cz/index.php/Oracle_functionality_%28en%29#DBMS_ALERT
http://www.postgres.cz/index.php/Oracle_functionality_%28en%29#DBMS_PIPE


 Is there a way to start a pl/pgsql function that would persist after the
 user session has closed ?

yes - with some pooling software like pgpool, Bucardo, and others


 Is there a way for an unprivileged user to delegate a task (allowed by
 superuser) to a superuser ?

yes - look at the SECURITY DEFINER flag
http://www.postgresql.org/docs/8.4/static/sql-createfunction.html
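
A minimal sketch, with a hypothetical task: the superuser creates the
function, and the unprivileged role is granted EXECUTE on it; the body
then runs with the privileges of the function's owner.

CREATE FUNCTION run_privileged_task() RETURNS void AS $$
BEGIN
    TRUNCATE some_protected_table;  -- hypothetical delegated task
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

GRANT EXECUTE ON FUNCTION run_privileged_task() TO some_user;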


 Thanks in advance.



regards
Pavel Stehule

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Calculating the difference between timetz values

2009-07-27 Thread Alexey Klyukin


On Jul 27, 2009, at 6:52 PM, Michael Glaesemann wrote:



On Jul 27, 2009, at 10:54 , Alexey Klyukin wrote:


Hello,

I was looking for a way to get the difference (interval) between 2  
timetz values, i.e.:


I don't have a solution, but am curious what your use case is for  
timetz (as opposed to timestamptz).


I'm writing a custom trigger function that has to compare values of  
time* types and take some actions depending on the result.


Answering my own question: Alvaro proposed a solution with  
extract(epoch from tz_value); the result is in seconds:


postgres=# select extract(epoch from '00:00:00+0'::timetz) -
           extract(epoch from '2:00:00+2'::timetz);
 ?column?
----------
        0
(1 row)
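
To get an interval rather than raw seconds, the difference can be
multiplied back into an interval (a small variation on the same approach):

select (extract(epoch from '00:00:00+0'::timetz)
      - extract(epoch from '2:00:00+2'::timetz)) * interval '1 second';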

--
Alexey Klyukin   http://www.CommandPrompt.com
The PostgreSQL Company - Command Prompt, Inc.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] combining db's- importing primary keys

2009-07-27 Thread Bob Gobeille
I would like to combine multiple databases (same schema) into one  
master db.  Does anyone know how I can reconcile all my primary and  
foreign keys to maintain referential integrity?


Many thanks,
Bob

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres and multiple updates in one statement

2009-07-27 Thread Joshua D. Drake
On Mon, 2009-07-27 at 11:50 -0400, Tom Lane wrote:
 Scott Marlowe scott.marl...@gmail.com writes:
  On Mon, Jul 27, 2009 at 8:48 AM, Jamie
  Lawrence-Jennerjamie.jen...@autovhc.co.uk wrote:
  Our update statements are as follows
  
  Update table set col1=x,col2=y where pkid=1;
  Update table set col1=x,col2=y where pkid=2;
  Update table set col1=x,col2=y where pkid=3;
  
  Very simple and straight forward. Sometimes there could be as many as 50
  update statements to process.
 
  Might as well fire them each as separate statements inside one
  transaction, since pkid is unique and non-repeated in the updates.
 
 If they're all exactly the same pattern like that, it might be worth the
 trouble to set up a prepared statement.

Seems like an opportunity for the use of a function.
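
For example (a sketch only; all names are hypothetical), the batch could
be wrapped as:

CREATE FUNCTION update_batch(ids int[], x text, y text) RETURNS void AS $$
BEGIN
    UPDATE mytable SET col1 = x, col2 = y WHERE pkid = ANY (ids);
END;
$$ LANGUAGE plpgsql;

SELECT update_batch(ARRAY[1,2,3], 'x', 'y');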

Joshua D. Drake


 
   regards, tom lane
 
-- 
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Calculating the difference between timetz values

2009-07-27 Thread Tom Lane
Alexey Klyukin al...@commandprompt.com writes:
 On Jul 27, 2009, at 6:52 PM, Michael Glaesemann wrote:
 I don't have a solution, but am curious what your use case is for  
 timetz (as opposed to timestamptz).

 I'm writing a custom trigger function that has to compare values of  
 time* types and make some actions depending on a result.

It's still fairly unclear why you think that comparing timetz values
is a useful activity.  Is 23:32 earlier or later than 00:32?
How can you tell whether it's the same day or different days?  Adding
timezones into that doesn't make it better.

Our documentation deprecates timetz as a poorly-defined datatype,
and I've never seen a reason to argue with that judgment.  I'd suggest
taking a very hard look at why you're not using timestamptz instead.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Relational Algebra and Aggregate Functions

2009-07-27 Thread Jeff Davis
On Sun, 2009-07-26 at 15:36 -0400, Robert James wrote:
 I'm working on improving my background database theory, to aid in
 practice.  I've found learning relational algebra to be very helpful.
  One thing which relational algebra doesn't cover is aggregate
 functions.  Can anyone recommend any papers or web pages which provide
 some good theoretical background for aggregate functions?

When it comes to relational theory, C.J. Date is a good author. An
Introduction To Database Systems covers pretty much everything.

There's a formal definition of a relational algebra (including
SUMMARIZE, which is the authors' version of an aggregate operator)
built from only two operators here:
http://thethirdmanifesto.com/
(look for Appendix A)

Although Appendix A is not easy to understand without some basic
familiarity with the authors' other works.

Regards,
Jeff Davis


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Running vacuumdb -a taking too long

2009-07-27 Thread Bill Moran
In response to Keaton Adams kad...@mxlogic.com:

 We are to a point in size and utilization of a set of our Postgres 8.1.17 
 databases that a vacuumdb -a -z -U postgres is still running after 50 hours 
 and we have to kill it off because the additional i/o is causing queries to 
 stack up.  We have archived off as much data as possible (100+ GB) and we are 
 looking at moving to a better hardware configuration to support the increased 
 system activity, but for now I need to know if there is a way to avoid 
 transaction ID wraparound if we don't issue a vacuumdb -a to update the 
 datfrozenxid value.
 
 This is a logging type database where data is initially loaded and then 
 rolled up into weekly/monthly tables.  Many of the tables in the DB don't 
 last for more than seven days, and after a week the majority of the 
 historical tables are static and stay around until they are finally dropped 
 off (based on a retention period), so re-running a vacuum on these older 
 static tables doesn't really gain much since the data never changes.
 
 I have read (and re-read) the documentation and am  a bit confused on exactly 
 what needs to happen here:
 
 Since periodic VACUUM runs are needed anyway for the reasons described 
 earlier, it's unlikely that any table would not be vacuumed for as long as a 
 billion transactions. But to help administrators ensure this constraint is 
 met, VACUUM stores transaction ID statistics in the system table pg_database. 
 In particular, the datfrozenxid column of a database's pg_database row is 
 updated at the completion of any database-wide VACUUM operation (i.e., VACUUM 
 that does not name a specific table). The value stored in this field is the 
 freeze cutoff XID that was used by that VACUUM command. All normal XIDs older 
 than this cutoff XID are guaranteed to have been replaced by FrozenXID within 
 that database. A convenient way to examine this information is to execute the 
 query
 
 SELECT datname, age(datfrozenxid) FROM pg_database;
 
 The age column measures the number of transactions from the cutoff XID to the 
 current transaction's XID.
 
 So if, after a table is no longer added to (becomes static), I run a VACUUM 
 against it, the table wouldn't need to be vacuumed again since the tuples and 
 their related transaction ID's never change?  Is there a way to set up a 
 vacuum scheme on a table-by-table basis to accomplish the same goal as an 
 all-in-one vacuumdb -a run that I can spread out over time instead of relying 
 on the completion of a single vacuumdb -a command?

I'm not quite sure what the best answer is to your problem, but since
nobody else has suggested anything, here are my thoughts.

First off, can you allocate more maintenance_work_mem?  Perhaps that can
speed up vacuum enough.

Secondly, if you VACUUM FREEZE those static tables, it will guarantee
that you'll never lose data from them, even if you hit XID wraparound.
If you VACUUM FREEZE tables one at a time, perhaps you can avoid the
huge performance hit.
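
Something along these lines, one table at a time (the table name is
hypothetical, and on 8.1 the memory setting is an integer in kB; the value
is only an assumed example):

SET maintenance_work_mem = 524288;  -- 512MB, session-local
VACUUM FREEZE some_static_table_20090713;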

Third, while your argument about tables not needing to be vacuumed again seems
logical, it's simply not how PG functions.  Since the XID is database-wide,
it may affect any table.  Of course, the FREEZE process will protect tables
from this.  This seems to be improved in newer versions of Postgres, so
an upgrade should improve the issue.

Finally, are you really in danger of hitting the wraparound?  If you run
the query SELECT datname, age(datfrozenxid) FROM pg_database; (as suggested
in the docs) once a day for a few days, does it seem like you're using
up XIDs fast enough to be a danger?  If you've got new hardware coming
soon anyway, perhaps you have enough time to not worry about it on the
current hardware?

Hope this helps.

-- 
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] For production: 8.4 or 8.3?

2009-07-27 Thread Phoenix Kiula
Just looking for experiences of people. Are people already using 8.4
in serious live hosting environments? Thanks.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Video available for PGDay SJC '09

2009-07-27 Thread Christophe Pettus

Greetings,

The video recordings of the sessions for PG Day SJC '09 are now  
available:


Version 8.4: Easier to Administer than Ever / Josh Berkus / PostgreSQL  
Experts

 http://media.postgresql.org/pgday-sjc-09/pgday-sjc-09-easier.mov

Rapid Upgrades with pg_Migrator / Bruce Momjian / EnterpriseDB
 http://media.postgresql.org/pgday-sjc-09/pgday-sjc-09-migrator.mov

Check Please! What Your Postgres Databases Wishes You Would Monitor /  
Robert Treat / OmniTI

 http://media.postgresql.org/pgday-sjc-09/pgday-sjc-09-monitoring.mov

PostgreSQL Pitfalls / Jeff Davis / Truviso
 http://media.postgresql.org/pgday-sjc-09/pgday-sjc-09-pitfalls.mov

What works with Postgres: The Open Geo Data Interoperability Overview /  
Brian Hamlin / OSGeo Foundation

 http://media.postgresql.org/pgday-sjc-09/pgday-sjc-09-postgis.mov

Very Large Databases and PostgreSQL: Issues and Solutions / John  
Cieslewicz / Aster Data

 http://media.postgresql.org/pgday-sjc-09/pgday-sjc-09-petabytes.mov

pgGearman: A distributed worker queue for PostgreSQL / Brian Aker,  
Eric Day / Gearman Project

 http://media.postgresql.org/pgday-sjc-09/pgday-sjc-09-gearman.mov

Lightning Talks
 http://media.postgresql.org/pgday-sjc-09/pgday-sjc-09-lightning.mov


Many thanks to Steve Crawford for the audio system and assistance with  
setup, tear-down, and coordination during the event.


--
-- Christophe Pettus
 x...@thebuild.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] For production: 8.4 or 8.3?

2009-07-27 Thread Devrim GÜNDÜZ
On Tue, 2009-07-28 at 03:51 +0800, Phoenix Kiula wrote:
 Are people already using 8.4 in serious live hosting environments?

Not yet. There are lots of (important) fixes in CVS which are waiting
for 8.4.1. For production, I'd wait for a while.
-- 
Devrim GÜNDÜZ, RHCE
Command Prompt - http://www.CommandPrompt.com 
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
   http://www.gunduz.org




Re: [GENERAL] Running vacuumdb -a taking too long

2009-07-27 Thread Keaton Adams
We are upgrading to 8.3.7 in September, if that helps the situation at all.

I just want to make sure I thoroughly understand how these two statements work 
together:

The price is this maintenance requirement: every table in the database must be 
vacuumed at least once every billion transactions.

For each database in the cluster, PostgreSQL keeps track of the time of the 
last database-wide VACUUM. When any database approaches the billion-transaction 
danger level, the system begins to emit warning messages. If nothing is done, 
it will eventually shut down normal operations until appropriate manual 
maintenance is done.


Let's say I have two sets of tables in the production DB, one set is always 
around and the other set rolls forward in time using CREATE/DROP table 
(_MMDD), with the oldest table being 14 days old.

I never issue the actual vacuumdb -a command against the entire cluster, but 
I do a DB wide VACUUM; against databases such as postgres and template1 once 
a week.

I issue a VACUUM against each table that always exists in the database once a 
day (they do receive inserts/updates/deletes daily), because they are small 
enough to vacuum without a big performance hit.  This includes everything that 
is shown as a real table in pg_tables, except those that match the filter of 
'%_MMDD', to avoid the rolling table set.

The large day tables (rolling forward in time) I never vacuum, because I just 
COPY into them, and if there is a failed COPY operation or two I don't really 
care about the wasted space, due to the fact that the table has a limited 
lifespan anyway. So these tables are always rolling forward and being dropped 
within a 14-day window.

So my questions are:


 1.  Will the Postgres cluster eventually shut down because I never do a true 
database-wide VACUUM using a vacuumdb -a command on all of the databases, 
even though I vacuum the tables in the production database that have a lifespan 
of greater than 14 days?
 2.  Would I ever be at risk of losing data in a table that is only around for 
a two-week (14-day) period if I never do this database-wide VACUUM on the 
actual production DB?

Thanks again for your response.




On 7/27/09 1:41 PM, Bill Moran wmo...@potentialtech.com wrote:

In response to Keaton Adams kad...@mxlogic.com:

 We are to a point in size and utilization of a set of our Postgres 8.1.17 
 databases that a vacuumdb -a -z -U postgres is still running after 50 hours 
 and we have to kill it off because the additional i/o is causing queries to 
 stack up.  We have archived off as much data as possible (100+ GB) and we are 
 looking at moving to a better hardware configuration to support the increased 
 system activity, but for now I need to know if there is a way to avoid 
 transaction ID wraparound if we don't issue a vacuumdb -a to update the 
 datfrozenxid value.

 This is a logging type database where data is initially loaded and then 
 rolled up into weekly/monthly tables.  Many of the tables in the DB don't 
 last for more than seven days, and after a week the majority of the 
 historical tables are static and stay around until they are finally dropped 
 off (based on a retention period), so re-running a vacuum on these older 
 static tables doesn't really gain much since the data never changes.

 I have read (and re-read) the documentation and am  a bit confused on exactly 
 what needs to happen here:

 Since periodic VACUUM runs are needed anyway for the reasons described 
 earlier, it's unlikely that any table would not be vacuumed for as long as a 
 billion transactions. But to help administrators ensure this constraint is 
 met, VACUUM stores transaction ID statistics in the system table pg_database. 
 In particular, the datfrozenxid column of a database's pg_database row is 
 updated at the completion of any database-wide VACUUM operation (i.e., VACUUM 
 that does not name a specific table). The value stored in this field is the 
 freeze cutoff XID that was used by that VACUUM command. All normal XIDs older 
 than this cutoff XID are guaranteed to have been replaced by FrozenXID within 
 that database. A convenient way to examine this information is to execute the 
 query

 SELECT datname, age(datfrozenxid) FROM pg_database;

 The age column measures the number of transactions from the cutoff XID to the 
 current transaction's XID.

 So if, after a table is no longer added to (becomes static), I run a VACUUM 
 against it, the table wouldn't need to be vacuumed again since the tuples and 
 their related transaction ID's never change?  Is there a way to set up a 
 vacuum scheme on a table-by-table basis to accomplish the same goal as an 
 all-in-one vacuumdb -a run that I can spread out over time instead of relying 
 on the completion of a single vacuumdb -a command?

I'm not quite sure what the best answer is to your problem, but since
nobody else has suggested anything, here are my thoughts.

First off, can you allocate more 

Re: [GENERAL] For production: 8.4 or 8.3?

2009-07-27 Thread Tory M Blue
On Mon, Jul 27, 2009 at 12:51 PM, Phoenix Kiulaphoenix.ki...@gmail.com wrote:
 Just looking for experiences of people. Are people already using 8.4
 in serious live hosting environments? Thanks.


Wait..

 8.3 is running fine and dandy. Lots of decent-sized changes in 8.4
with fixes still pending. So wait.

And those that have multiple TBs of data, weee, another dump and
restore upgrade (pt!)

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] For production: 8.4 or 8.3?

2009-07-27 Thread Thomas Kellerer

Tory M Blue wrote on 27.07.2009 22:45:

And those that have multiple TB's of data, weee another dump and
restore upgrade (pt!)


Isn't that what pg_migrator is for?



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] For production: 8.4 or 8.3?

2009-07-27 Thread Scott Mead
On Mon, Jul 27, 2009 at 4:45 PM, Tory M Blue tmb...@gmail.com wrote:



 And those that have multiple TB's of data, weee another dump and
 restore upgrade (pt!)


  pg_migrator doesn't need to dump and restore; it can do an in-place upgrade
of the data files for you.

http://archives.postgresql.org/pgsql-committers/2009-06/msg00031.php

--Scott


Re: [GENERAL] For production: 8.4 or 8.3?

2009-07-27 Thread Scott Marlowe
On Mon, Jul 27, 2009 at 2:48 PM, Thomas Kellererspam_ea...@gmx.net wrote:
 Tory M Blue wrote on 27.07.2009 22:45:

 And those that have multiple TB's of data, weee another dump and
 restore upgrade (pt!)

 Isn't that what pg_migrator is for?

I use slony for such things, downtime = zero (ok a few seconds)

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] For production: 8.4 or 8.3?

2009-07-27 Thread Joshua D. Drake
On Mon, 2009-07-27 at 22:48 +0200, Thomas Kellerer wrote:
 Tory M Blue wrote on 27.07.2009 22:45:
  And those that have multiple TB's of data, weee another dump and
  restore upgrade (pt!)
 
 Isn't that what pg_migrator is for?

It depends: 8.3 and 8.4 are not compatible by default (because of
--integer-datetimes). So yes if you are running Debian/Ubuntu, but if
you are running CentOS/RH with the defaults, pg_migrator isn't going to
work unless you compile Pg from source.

Joshua D. Drake

-- 
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Running vacuumdb -a taking too long

2009-07-27 Thread Martijn van Oosterhout
On Mon, Jul 27, 2009 at 02:21:02PM -0600, Keaton Adams wrote:
 We are upgrading to 8.3.7 in September, if that helps the situation at all.

This is good. Since 8.2 VACUUM age is done per table instead of per
database. This should solve most of your problems.

 So my questions are:
 
  1.  Will the Postgres cluster eventually shut down because I never
  do a true database-wide VACUUM using a vacuumdb -a command on all
  of the databases, even though I vacuum the tables in the production
  database that have a lifespan of greater than 14 days?

On older versions you need to do a database-wide vacuum (note this is
not vacuumdb -a) once every billion transactions.

Did you take the advice in the email you responded to with respect to
speeding up vacuum? And using

  SELECT datname, age(datfrozenxid) FROM pg_database;

to determine if it's an actual problem (just post the results if you
can't interpret them).

  2.  Would I ever be at risk of losing data in a table that is only
  around for a two week (14 day period) if I never do this database
  wide VACUUM on the actual production DB?

You won't lose data, but you need to do a DB wide (not cluster-wide)
vacuum to advance the wraparound counter...
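
That is, something like this run in each database in turn, which can be
spread out over time:

-- connect to the database, then run a bare, database-wide:
VACUUM;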

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.




Re: [GENERAL] Running vacuumdb -a taking too long

2009-07-27 Thread Keaton Adams
 This is good. Since 8.2 VACUUM age is done per table instead of per
 database. This should solve most of your problems.
 On older versions you need to do a database-wide vacuum (note this is
 not vacuumdb -a) once every billion transactions.
 You won't lose data, but you need to do a DB wide (not cluster-wide)
 vacuum to advance the wraparound counter...

That answered the question exactly as I needed to hear it.

 Did you take the advice in the email you responded to with respect to
 speeding up vacuum?

Great suggestion and one that I will try out over the next couple of days.

 And using
 to determine if it's an actual problem (just post the results if you
 can't interpret them).

postgres=# SELECT datname, age(datfrozenxid) FROM pg_database;
  datname   |    age
------------+------------
 postgres   | 1073741878
 listenerdb | 1074114794
 template1  | 1073908727
 template0  |   30121699
(4 rows)


Thanks again.




On 7/27/09 3:10 PM, Martijn van Oosterhout klep...@svana.org wrote:

 On Mon, Jul 27, 2009 at 02:21:02PM -0600, Keaton Adams wrote:
 We are upgrading to 8.3.7 in September, if that helps the situation at all.
 
 This is good. Since 8.2 VACUUM age is done per table instead of per
 database. This should solve most of your problems.
 
 So my questions are:
 
  1.  Will the Postgres cluster eventually shut down because I never
  do a true database-wide VACUUM using a vacuumdb -a command on all
  of the databases, even though I vacuum the tables in the production
  database that have a lifespan of greater than 14 days?
 
 On older versions you need to do a database-wide vacuum (note this is
 not vacuumdb -a) once every billion transaction.
 
 Did you take the advice in the email you responded to with respect to
 speeding up vacuum? And using
 
 SELECT datname, age(datfrozenxid) FROM pg_database;
 
 to determine if it's an actual problem (just post the results if you
 can't interpret them).
 
  2.  Would I ever be at risk of losing data in a table that is only
  around for a two week (14 day period) if I never do this database
  wide VACUUM on the actual production DB?
 
 You won't lose data, but you need to do a DB wide (not cluster-wide)
 vacuum to advance the wraparound counter...
 
 Have a nice day,


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] C Function Question

2009-07-27 Thread Terry Lee Tucker
Greetings:

Does anyone know whether a function written in C and linked into the backend 
as a shared library, with a statically declared structure, maintains that data 
for the life of the backend process, such that when the function is called 
again the structure data is intact?
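
A minimal sketch (the function is hypothetical) of what is being asked: a
static variable in a backend-loaded shared library keeps its value across
calls within one backend process, though every backend gets its own copy.

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(call_counter);

/* returns how many times this backend has called the function */
Datum
call_counter(PG_FUNCTION_ARGS)
{
    static int32 calls = 0;     /* statically declared state: persists
                                   for the life of the backend process */

    calls++;
    PG_RETURN_INT32(calls);
}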

Thanks for any insight anyone can give...
-- 

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] For production: 8.4 or 8.3?

2009-07-27 Thread Tom Lane
Joshua D. Drake j...@commandprompt.com writes:
 It depends, 8.3 and 8.4 are not compatible by default (because of
 --integer-datetimes). So, yeah if you are running Debian/Ubuntu but if
 you are running Cent/RH with the defaults, pg_migrator isn't going to
 work unless you compile Pg from source.

Oh?  You think RH/Cent is going to change that default now?  Think again.

Of course the real question is whether you trust pg_migrator to not eat
your data.  Those who are afraid to trust 8.4.0 will probably not care
to trust pg_migrator for a few versions yet either...

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] For production: 8.4 or 8.3?

2009-07-27 Thread Joshua D. Drake
On Mon, 2009-07-27 at 19:16 -0400, Tom Lane wrote:
 Joshua D. Drake j...@commandprompt.com writes:
  It depends, 8.3 and 8.4 are not compatible by default (because of
  --integer-datetimes). So, yeah if you are running Debian/Ubuntu but if
  you are running Cent/RH with the defaults, pg_migrator isn't going to
  work unless you compile Pg from source.
 
 Oh?  You think RH/Cent is going to change that default now?  Think again.
 

I thought they would get around to changing it now. That is a shame
because RH really can't be used as a production PostgreSQL server (if
date based data is important) unless you recompile or install the
--integer-datetime rpms.

Joshua D. Drake

-- 
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] For production: 8.4 or 8.3?

2009-07-27 Thread Tom Lane
Joshua D. Drake j...@commandprompt.com writes:
 On Mon, 2009-07-27 at 19:16 -0400, Tom Lane wrote:
 Oh?  You think RH/Cent is going to change that default now?  Think again.

 I thought they would get around to changing it now.

They is me, and it's not changing.  I'm not blowing a chance at
in-place upgrade to switch the integer-timestamp default.

 because RH really can't be used as a production PostgreSQL server (if
 date based data is important)

I have open bugs about the lack of in-place upgrade.  I have never once
heard a customer complain about FP timestamps.  So your position is
nonsense.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] general question on two-partition table

2009-07-27 Thread Janet Jacobsen
Hi.  We have a table with 30 M records that is growing by
about 100 K records per day.

The experimentalists, whose data are in the table, have
decided that they will focus on the records for which the
value of one field, rbscore, is greater than a cut-off.
However, they want to continue to store all of the data
- even the records for which rbscore is less than the cutoff
- in the database.

For the current table, there are about 400 K (out of 30 M)
records that meet the 'good' criteria.

Each record in the table has about 40 fields, and the
experimentalists do in fact write queries that use many of
those fields, some more than others.  (They are building a
model and have not pinned down exactly which fields are
more important than others, so that's why they store and
query by so many fields.)

If they are going to spend 95% of their time querying the
records that meet the 'good' criteria, what are the good
strategies for ensuring good performance for those queries?
(1) Should I partition the table into two partitions based on
the value of rbscore?
(2) Should I create two separate tables?

Are (1) and (2) more or less equivalent in terms of
performance?

I think that partitioning the table is a more flexible option
(i.e., what if the cutoff value changes, no need to change
the name of the table being queried, etc.), but would (2)
give better performance given that 95% of their queries
are for rbscores greater than a threshold value?

Can you suggest other strategies?

Thank you,
Janet




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Building from source vs RPMs

2009-07-27 Thread Christophe Pettus
I'm moving from a long time in BSD-land to using Linux.  I've always  
been in the habit of building PostgreSQL from the source tarballs.  On  
Linux, is there an advantage either way to using the RPMs as opposed  
to building from source?  Thanks!

--
-- Christophe Pettus
   x...@thebuild.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] A question about the permissions

2009-07-27 Thread Tim Uckun
I am trying to monitor replication lag using Zabbix. I have written a
simple script in Ruby to get the lag; it goes like this:

require 'date'
require 'yaml'

# pg_controldata's output happens to be YAML-parseable
y = YAML.load `/usr/lib/postgresql/8.3/bin/pg_controldata /var/lib/postgresql/8.3/main`
last_checkpoint = DateTime.parse(y['Time of latest checkpoint'])
# convert the day fraction since the last checkpoint into h/m/s
hours, mins, secs, fractions = Date::day_fraction_to_time(DateTime.now - last_checkpoint)
puts hours * 60 * 60 + mins * 60 + secs

When I try to run this script as the zabbix user (or any user other
than postgres or root), I get the error:

pg_controldata: could not open file
/var/lib/postgresql/8.3/main/global/pg_control for reading:
Permission denied

This is because everything under the 8.3 directory is readable by the
postgres user only (there are no group permissions).

Is there any harm in giving the postgres group the same rights as the
postgres user and adding the zabbix user to the postgres group?

What is the rationale for not giving the group any permissions at all?

Thanks.

BTW, I think it's really cool that the output of pg_controldata is in a
YAML-parseable format. I am pretty sure that's by accident, but don't
change it :)

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] For production: 8.4 or 8.3?

2009-07-27 Thread Joshua D. Drake
On Mon, 2009-07-27 at 19:44 -0400, Tom Lane wrote:

  because RH really can't be used as a production PostgreSQL server (if
  date based data is important)
 
 I have open bugs about the lack of in-place upgrade.  I have never once
 heard a customer complain about FP timestamps.  So your position is
 nonsense.

Most customers wouldn't even understand the problem. We have systems we
have to custom-maintain due to PostgreSQL having ghost data because of
the floating-point-based timestamp storage.

The problem is very simple: if you run on RH with the defaults, you have
an opportunity for data that will disappear in a practical sense. You know
this is true; my response is not nonsense. The data is still there, but
it is floating-point based and thus inexact. The WHERE clause that you
expect to retrieve the data may not.


Joshua D. Drake

-- 
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] general question on two-partition table

2009-07-27 Thread David Wilson
On Mon, Jul 27, 2009 at 7:52 PM, Janet Jacobsen jsjacob...@lbl.gov wrote:

 Can you suggest other strategies?

Something that might be easier to play with is to create a functional
index (or several, to speed up other queries) on the comparison
between rbscore and the cutoff. It won't buy you anything on seq
scans, but if most of the queries are done on small subsets of the
tuples which meet the 'good' criteria, it could be a big win that's
very easy to implement.
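
A sketch of that idea, assuming a hypothetical table named candidates
and a cutoff of 0.9 (both are placeholders): index the boolean
expression itself, and repeat the expression in queries so the planner
can use it.

CREATE INDEX candidates_is_good_idx ON candidates ((rbscore > 0.9));

SELECT count(*) FROM candidates WHERE (rbscore > 0.9);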

-- 
- David T. Wilson
david.t.wil...@gmail.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] general question on two-partition table

2009-07-27 Thread Greg Stark
On Tue, Jul 28, 2009 at 1:08 AM, David Wilson david.t.wil...@gmail.com wrote:
 On Mon, Jul 27, 2009 at 7:52 PM, Janet Jacobsen jsjacob...@lbl.gov wrote:

 Can you suggest other strategies?

 Something that might be easier to play with is to create a (or
 several, to speed up other queries) functional index on the comparison
 between rbscore and the cutoff.

I think it would be even more interesting to have partial indexes --
i.e., specified with WHERE rbscore > cutoff.

I'm actually wondering if partitioning is really what you want. You
might prefer to just keep two entirely separate tables: one that has
all the data and one that has a second copy of the desirable subset.
Kind of like a materialized view of a simple query with the WHERE
clause of rbscore > cutoff.
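
A sketch of both suggestions, again assuming a hypothetical table
candidates with a cutoff of 0.9:

-- Partial index: only rows satisfying the predicate are indexed, and
-- the planner can use it when a query's WHERE clause implies the
-- predicate.
CREATE INDEX candidates_good_idx ON candidates (rbscore)
    WHERE rbscore > 0.9;

SELECT * FROM candidates WHERE rbscore > 0.95;

-- Poor man's materialized view: a second table holding only the
-- desirable subset, refreshed by whatever process loads the main table.
CREATE TABLE good_candidates AS
    SELECT * FROM candidates WHERE rbscore > 0.9;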


-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] general question on two-partition table

2009-07-27 Thread David Wilson
On Mon, Jul 27, 2009 at 8:24 PM, Greg Stark gsst...@mit.edu wrote:

 I think it would be even more interesting to have partial indexes --
 i.e., specified with WHERE rbscore > cutoff.

Yes -- that's what I actually meant. Words got scrambled between brain
and fingers...

-- 
- David T. Wilson
david.t.wil...@gmail.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Clients disconnect but query still runs

2009-07-27 Thread Robert James
Hi.  I noticed that when clients (both psql and pgAdmin) disconnect or
cancel, queries are often still running on the server.  A few questions:
1) Is there a way to reconnect and get the results?
2) Is there a way to tell postgres to automatically stop all queries when
the client who queried them disconnects?
3) Is there a way to see all queries whose clients have disconnected?
4) And finally: Why is this the behavior? Doesn't this keep some very long
queries running which drain performance but don't seem to benefit anyone?


Re: [GENERAL] general question on two-partition table

2009-07-27 Thread Stephen Frost
* Janet Jacobsen (jsjacob...@lbl.gov) wrote:
 If they are going to spend 95% of their time querying the
 records that meet the 'good' criteria, what are the good
 strategies for ensuring good performance for those queries?
 (1) Should I partition the table into two partitions based on
 the value of rbscore?
 (2) Should I create two separate tables?
 
 Are (1) and (2) more or less equivalent in terms of
 performance?

It's not clear to me what you plan here.  How would you handle (2) for
the users?  Would you construct a view across them, or expect them to
query the right table(s)?  The options, as I see them, in order of
best to worst wrt user-friendliness and performance, I believe, are:

1- Partitioning (with CHECK constraints and constraint_exclusion)
2- View across two tables (with appropriate WHERE clauses)
3- Functional index (as suggested by someone else)
4- separate tables (users have to figure out how to use them)
5- single table with everything

My recommendation would be #1, followed by #2.  Be sure to look up how
to do partitioning using inheritance in PG, and, if you need to,
look at how to implement a trigger to handle inserts on the parent
table.  Make sure you create your CHECK() constraints correctly, that
you have constraint_exclusion enabled, and that it *works*.
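
A minimal sketch of that setup, assuming a hypothetical parent table
candidates and a cutoff of 0.9 (adjust names, columns, and threshold
for the real schema):

CREATE TABLE candidates (
    id      bigint,
    rbscore real
);
CREATE TABLE candidates_good  (CHECK (rbscore >  0.9)) INHERITS (candidates);
CREATE TABLE candidates_other (CHECK (rbscore <= 0.9)) INHERITS (candidates);

-- Route inserts on the parent to the right child.
CREATE OR REPLACE FUNCTION candidates_insert() RETURNS trigger AS $$
BEGIN
    IF NEW.rbscore > 0.9 THEN
        INSERT INTO candidates_good VALUES (NEW.*);
    ELSE
        INSERT INTO candidates_other VALUES (NEW.*);
    END IF;
    RETURN NULL;  -- suppress the insert into the parent itself
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER candidates_insert_trig
    BEFORE INSERT ON candidates
    FOR EACH ROW EXECUTE PROCEDURE candidates_insert();

-- Verify constraint exclusion works: the plan should scan only
-- candidates_good.
SET constraint_exclusion = on;
EXPLAIN SELECT * FROM candidates WHERE rbscore > 0.95;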

 I think that partitioning the table is the more flexible option
 (e.g., if the cutoff value changes, there is no need to change
 the name of the table being queried), but would (2)
 give better performance, given that 95% of their queries
 are for rbscores greater than a threshold value?

If you have your partitioning set up correctly, I don't believe having
actual separate tables would be that much of a performance gain for
queries.  It would help some with inserts, though: if you know which
table to insert into, you can insert directly into that child rather
than going through the parent and its trigger.

Enjoy,

Stephen


signature.asc
Description: Digital signature


Re: [GENERAL] Relational Algebra and Aggregate Functions

2009-07-27 Thread Robert James
Thanks for all the good replies (both on and off list).  It seems the
consensus is for me to read Christopher Date.  I found two relevant Date
books:
1) Introduction to Database Systems
http://www.amazon.com/Introduction-Database-Systems-Kannan-Swamynathan/dp/B001BVYKY4/ref=sr_1_5?ie=UTF8s=booksqid=1248742811sr=1-5
and
2) Database in Depth: Relational Theory for Practitioners
http://www.amazon.com/Database-Depth-Relational-Theory-Practitioners/dp/0596100124/ref=sr_1_7?ie=UTF8s=booksqid=1248742811sr=1-7
Any recommendations as to which? From the titles, I'd be inclined towards
the second, but not if the first is better.  One thing I'm not interested in
is polemics against SQL and lamentations on how ignorant all practitioners
are.

On Mon, Jul 27, 2009 at 2:45 PM, Jeff Davis pg...@j-davis.com wrote:

 On Sun, 2009-07-26 at 15:36 -0400, Robert James wrote:
  I'm working on improving my background database theory, to aid in
  practice.  I've found learning relational algebra to be very helpful.
   One thing which relational algebra doesn't cover is aggregate
  functions.  Can anyone recommend any papers or web pages which provide
  some good theoretical background for aggregate functions?

 When it comes to relational theory, C.J. Date is a good author. An
 Introduction To Database Systems covers pretty much everything.

 There's a formal definition of a relational algebra (including
 SUMMARIZE, which is the authors' version of an aggregate operator)
 defined with only two operators here:
 http://thethirdmanifesto.com/
 (look for Appendix A)

 Although Appendix A is not easy to understand without some basic
 familiarity with the authors' other works.

 Regards,
 Jeff Davis




Re: [GENERAL] Relational Algebra and Aggregate Functions

2009-07-27 Thread Jeff Davis
On Mon, 2009-07-27 at 21:05 -0400, Robert James wrote:
 1) Introduction to Database Systems
 http://www.amazon.com/Introduction-Database-Systems-Kannan-Swamynathan/dp/B001BVYKY4/ref=sr_1_5?ie=UTF8s=booksqid=1248742811sr=1-5
 
 and
 2) Database in Depth: Relational Theory for Practitioners
 http://www.amazon.com/Database-Depth-Relational-Theory-Practitioners/dp/0596100124/ref=sr_1_7?ie=UTF8s=booksqid=1248742811sr=1-7

I recommend #2. It's shorter and easier to read than An Introduction to
Database Systems, and I think it will answer your question about
relational theory and aggregates (see SUMMARIZE).
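
For flavor, a rough SQL analogue of the book's canonical SUMMARIZE
example (total quantity per supplier, using the suppliers-and-parts
schema from Date's books):

-- Roughly: SUMMARIZE SP PER ( S { SNO } ) ADD ( SUM ( QTY ) AS TOTQTY )
-- (except that GROUP BY omits suppliers with no shipments, which the
-- PER ( S { SNO } ) form would include with TOTQTY = 0)
SELECT sno, sum(qty) AS totqty
  FROM sp
 GROUP BY sno;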

Appendix A is a part of Databases, Types, and the Relational Model: The
Third Manifesto. That's interesting, but it's not an easy read; it
describes a system in formal detail.

Regards,
Jeff Davis


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Clients disconnect but query still runs

2009-07-27 Thread Tom Lane
Robert James srobertja...@gmail.com writes:
 Hi.  I noticed that when clients (both psql and pgAdmin) disconnect or
 cancel, queries are often still running on the server.  A few questions:
 1) Is there a way to reconnect and get the results?

No.

 2) Is there a way to tell postgres to automatically stop all queries when
 the client who queried them disconnects?

No.

 3) Is there a way to see all queries whose clients have disconnected?

No.

 4) And finally: Why is this the behavior?

It's not easy to tell whether a client has disconnected (particularly if
the network stack is unhelpful, which is depressingly often true).
Postgres will cancel a query if it gets told that the connection's been
dropped, but it will only discover this when an attempt to output to the
client fails.  It does not spend cycles looking aside to see if the
connection has dropped when it is doing something that doesn't involve
output to the client.

If your client code is polite enough to send a cancel request before
disconnecting, that should terminate the query reasonably promptly.
But just yanking the plug doesn't do that.
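
A sketch of the polite approach, run from another session (column names
as in 8.3/8.4; the pid below is hypothetical):

-- Find running backends and what they are executing.
SELECT procpid, usename, current_query
  FROM pg_stat_activity
 WHERE current_query NOT LIKE '<IDLE>%';

-- Ask a chosen backend to cancel its current query.
SELECT pg_cancel_backend(12345);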

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] A question about the permissions

2009-07-27 Thread Tom Lane
Tim Uckun timuc...@gmail.com writes:
 What is the rationale for not giving the group any permissions at all?

On lots of systems, giving group permissions is nearly as bad as giving
world permissions (eg, all the users might be in a users group).
So we don't do it by default.  If you want to poke holes in the security
of your own installation, go right ahead.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Postgres 8.2 database recovery Could not create relation Invalid Argument

2009-07-27 Thread Justin Alston
Novice here :). I have PostgreSQL 8.2 installed on a single-board computer
running Windows XP Embedded on a Compact Flash drive -- 2 databases with no
more than 2000 rows. After 10 power cycles spaced 6 mins apart, I noticed that
the postgres.exe processes were no longer running. I located the log file (see
below); it appears postgres attempts to recover but fails to do so. Any
suggestions?

2009-07-28 04:31:01 LOG:  database system was interrupted at 2009-07-28
04:27:54 GMT Daylight Time
2009-07-28 04:31:01 LOG:  checkpoint record is at 0/7CBF58
2009-07-28 04:31:01 LOG:  redo record is at 0/7CBF58; undo record is at 0/0;
shutdown TRUE
2009-07-28 04:31:01 LOG:  next transaction ID: 0/12290; next OID: 17183
2009-07-28 04:31:01 LOG:  next MultiXactId: 1; next MultiXactOffset: 0
2009-07-28 04:31:01 LOG:  database system was not properly shut down;
automatic recovery in progress
2009-07-28 04:31:02 LOG:  redo starts at 0/7CBFA8
2009-07-28 04:31:02 FATAL:  could not create relation 1663/16403/16586:
Invalid argument
2009-07-28 04:31:02 CONTEXT:  xlog redo update: rel 1663/16403/16586; tid
35/32; new 35/33
2009-07-28 04:31:02 LOG:  startup process (PID 1484) exited with exit code 1
2009-07-28 04:31:02 LOG:  aborting startup due to startup process failure
2009-07-28 04:31:02 LOG:  logger shutting down

Thanks in advance for any insights.

Jus