Re: [GENERAL] Weird performance issue with custom function with a for loop.

2011-02-01 Thread Nicos Panayides

On 01/31/2011 08:48 PM, Tom Lane wrote:

Nicos Panayides <ni...@magneta.com.cy> writes:
   

the following function takes forever to execute as is. I see 'Inserting
original actions in temporary table' and nothing after that. If I
replace orig_user_id in the FOR loop with 1811 (the same orig_user_id
passed as the function parameter) it returns correctly and immediately (the
table has indices so it's very fast).
 

It seems likely that you're getting a different plan for the generic
case because that user id isn't representative of the overall average
for the column.  You could investigate by explaining a parameterized
query:

PREPARE foo (bigint) AS
   SELECT ... WHERE game_round_actions.user_id = $1 ... ;
EXPLAIN EXECUTE foo(1811);

(To really see exactly what's happening, you'd probably need to
parameterize for each of the plpgsql variables used in the query;
I'm suspicious that the BETWEEN might be contributing to the
issue as well.)
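
(For the fully parameterized form, a simplified sketch against the table from
this thread might look like the following; the date values are placeholders and
the real query's join against game_tables is left out:)

PREPARE foo (bigint, timestamp, timestamp) AS
   SELECT * FROM game_round_actions
    WHERE user_id = $1
      AND action_time BETWEEN $2 AND $3
      AND sub_action_id = 0;
EXPLAIN EXECUTE foo(1811, '2011-01-01', '2011-01-31');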

Possibly increasing the stats target for the user id column would help,
but it's hard to be sure without knowing what its distribution is like.

regards, tom lane
   
I tried the prepared statement with both $1 and 1811 for user_id and 
here are the plans I got:


Sort  (cost=51704688.71..51704689.50 rows=314 width=57)
  Sort Key: game_round_actions.action_time, game_round_actions.action_id
  ->  Nested Loop  (cost=0.00..51704675.69 rows=314 width=57)
        ->  Seq Scan on game_round_actions  (cost=0.00..51702078.26 rows=314 width=53)
              Filter: ((action_time >= $2) AND (action_time <= $3) AND (sub_action_id = 0) AND (user_id = $1))
        ->  Index Scan using PK_game_table on game_tables  (cost=0.00..8.26 rows=1 width=12)
              Index Cond: (game_tables.table_id = game_round_actions.table_id)
              Filter: (game_tables.game_type_id <> ANY ($4))

Sort  (cost=226660.58..226661.33 rows=300 width=57)
  Sort Key: game_round_actions.action_time, game_round_actions.action_id
  ->  Nested Loop  (cost=0.00..226648.24 rows=300 width=57)
        ->  Index Scan using i_session on game_round_actions  (cost=0.00..224166.97 rows=300 width=53)
              Index Cond: ((action_time >= $2) AND (action_time <= $3))
              Filter: (user_id = 1811)
        ->  Index Scan using PK_game_table on game_tables  (cost=0.00..8.26 rows=1 width=12)
              Index Cond: (game_tables.table_id = game_round_actions.table_id)
              Filter: (game_tables.game_type_id <> ANY ($4))


Here's the table definition:

CREATE TABLE game_round_actions
(
  table_id bigint NOT NULL,
  round_id integer NOT NULL,
  action_id integer NOT NULL,
  seat_id integer NOT NULL,
  action_desc character varying(20) NOT NULL,
  action_area character varying(100),
  amount numeric(16,6),
  action_value character varying(100),
  action_time timestamp without time zone NOT NULL DEFAULT 
CURRENT_TIMESTAMP AT TIME ZONE 'UTC',

  user_id bigint,
  sub_action_id integer NOT NULL, -- Sub action id is 0 for the root 
actions, > 0 for generated actions.


  CONSTRAINT PK_game_round_actions PRIMARY KEY (table_id, round_id, 
action_id, sub_action_id),

  CONSTRAINT fk_game_round_actions_round FOREIGN KEY (table_id, round_id)
  REFERENCES game_rounds (table_id, round_id) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE CASCADE,
  CONSTRAINT fk_game_round_actions_user FOREIGN KEY (table_id, user_id)
  REFERENCES game_table_users (table_id, user_id) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
  OIDS=FALSE
);

CREATE INDEX i_session
  ON game_round_actions
  USING btree
  (action_time)
  WHERE user_id <> 0 AND sub_action_id = 0;


The table contains 1 655 528 000 rows (estimated) and there are about 
1 unique user_ids. The data spans about 2 years.
Shouldn't postgres realise that in both cases user_id is compared 
against a constant value and choose the same plan?


How do I increase the stats target for the column?
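
For reference, the usual way is per column with ALTER TABLE, followed by a 
fresh ANALYZE; a sketch with an arbitrary target value:

ALTER TABLE game_round_actions ALTER COLUMN user_id SET STATISTICS 1000;
ANALYZE game_round_actions;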

--
Regards,

Nicos Panayides
IT Manager

Magneta Technologies Ltd
Tel: +357 22721919, 22317400
Fax: +357 22721917
Web: http://www.magneta.eu


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Select for update with offset interferes with concurrent transactions

2011-02-01 Thread Yngve Nysaeter Pettersen

Hello all,

I am in the process of migrating a system from Postgresql 8.3 to 9.0, and  
have run into a problem with the task queue systems I am using.


The task queue controls the allocation of tasks between about 1000  
processes working in parallel, and is essentially a table of


   record_id (unique)
   project_id
   task_description_id
   state (idle, started, finished)

Each project currently has about 2 million entries. My plan is to  
increase that significantly over the next few months.


To avoid having the processes trample each other's queries (the first  
attempt was to select the first matching entries of the table, which  
caused one to block all other transactions), one of the steps I took was  
to select a set of idle rows at a random offset into the table from the  
project, mark them for update, then update each record's state as started.


  SELECT record_id FROM queue WHERE project_id = my_project AND state =  
idle LIMIT n OFFSET i FOR UPDATE


At present n is 100-150, i is a random value in the range 0-1.

There is, intentionally, no ordering specified, since that would just slow  
down the query, and is not necessary.


For reference, the above query is sent through Django's cursor.execute()  
call in a manual transaction block.




What I've discovered when using Postgres 9.0 is that the processes are now  
blocking every other query into this table, apparently reducing the task  
processing speed by at least a factor of 10, and increasing the load on  
the server by a similar factor, compared to when Postgres 8.3 was used.  
The problem is apparent just after starting, with only 50-100 processes  
active (startup is staggered).


Reducing n (and looping), or increasing the i range did not work.


The reason seems to be this new part of  
http://www.postgresql.org/docs/9.0/static/sql-select.html (towards the end  
of the FOR UPDATE section):


   If a LIMIT is used, locking stops once enough rows have been returned  
to satisfy the limit
   (but note that rows skipped over by OFFSET will get locked). Similarly,  
if FOR UPDATE or
   FOR SHARE is used in a cursor's query, only rows actually fetched or  
stepped past by the

   cursor will be locked.

I can't find similar text in the 8.3 or 8.4 documentation.

AFAICT, and assuming I have not misunderstood this part of the  
documentation, this means that if one of my processing nodes selects a  
block of 100 entries at offset 8000 in the resulting table, then every  
other node will be blocked while the block is being processed, not just  
the nodes that would have selected the rows in the range 0 to 7999, but  
also those at >= 8100, because they cannot gain access to the rows.


Also, using FOR SHARE does not seem to solve the problem.

IMO, as a database non-expert, locking rows that were not returned as a  
result of the query is a bug. As an example, if a query selects the X last  
items in the matching rows, that is equivalent to locking the table, or  
the relevant part of it, even if the requester has no intention of modifying  
those other rows.



Is there any way to avoid this problem? Or do I have to add a random  
batch_id field to the queue table in order to separate the processes'  
queries so that they do not block each other (as frequently)?


Is it possible to disable the source code causing this (that is, reverting  
the patch that introduced the problem, or changing a configuration switch)?



--
Sincerely,
Yngve N. Pettersen

Senior Developer Email: yn...@opera.com
Opera Software ASA   http://www.opera.com/
Phone:  +47 23 69 32 60  Fax:+47 23 69 24 01


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cast problem in Postgresql 9.0.1

2011-02-01 Thread Adrian Klaver
On Monday, January 31, 2011 10:14:29 pm AI Rumman wrote:
 I migrated my DB from Postgresql 8.1 to Postgresql 9.0.1.
 
 I have a table testtab
 \d testtab
 id int,
 hours varchar
 
 When I execute the following:
 select sum(hours) from testtab
 I get cast error.

Try:
select sum(hours::int) from testtab;

 
 Then,
 
 I created following IMPLICIT CAST functions in my DB =
 
 CREATE FUNCTION pg_catalog.integer(varchar) RETURNS int4 STRICT IMMUTABLE
 LANGUAGE SQL AS 'SELECT int4in(varcharout($1));';
 CREATE CAST (varchar AS integer) WITH FUNCTION pg_catalog.integer(varchar)
 AS IMPLICIT;
 
 CREATE FUNCTION pg_catalog.smallint(varchar) RETURNS smallint STRICT
 IMMUTABLE LANGUAGE SQL AS 'SELECT int2in(varcharout($1));';
 CREATE CAST (varchar AS smallint) WITH FUNCTION
 pg_catalog.smallint(varchar) AS IMPLICIT;
 
 Now, the above query works, but
 SELECT COALESCE(hours,0) from testtab
 failed.
 
 Any idea why?
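
(Likely the same type mismatch: 0 is an integer while hours is varchar, so an 
untested sketch of two ways around it:)

select coalesce(hours, '0') from testtab;      -- keep both arguments varchar
select coalesce(hours::int, 0) from testtab;   -- or cast, so both sides are integer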

-- 
Adrian Klaver
adrian.kla...@gmail.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Select for update with offset interferes with concurrent transactions

2011-02-01 Thread Andy Colson

On 2/1/2011 6:32 AM, Yngve Nysaeter Pettersen wrote:

Hello all,

I am in the process of migrating a system from Postgresql 8.3 to 9.0,
and have run into a problem with the task queue systems I am using.

The task queue controls the allocation of tasks between about 1000
processes working in parallel, and is essentially a table of

record_id (unique)
project_id
task_description_id
state (idle, started, finished)

Each project currently have about 2 million entries. My plan is to
increase that significantly the next few months.

To avoid having the processes trample each other's queries (the first
attempt was to select the first matching entries of the table, which
caused one to block all other transactions), one of the steps I took was
to select a set of idle rows at a random offset into the table from the
project, mark them for update, then update each record's state as started.

SELECT record_id FROM queue WHERE project_id = my_project AND state =
idle LIMIT n OFFSET i FOR UPDATE

At present n is 100-150, i is a random value in the range 0-1.

There is, intentionally, no ordering specified, since that would just
slow down the query, and is not necessary.

For reference, the above query is sent through Django's cursor.execute()
call in a manual transaction block.



What I've discovered when using Postgres 9.0 is that the processes are
now blocking every other query into this table, apparently reducing the
task processing speed by at least a factor of 10, and increasing the
load on the server by a similar factor, compared to when Postgres 8.3
was used. The problem is apparent just after starting, with only 50-100
processes active (startup is staggered).

Reducing n (and looping), or increasing the i range did not work.


The reason seems to be this new part of
http://www.postgresql.org/docs/9.0/static/sql-select.html (towards the
end of the FOR UPDATE section):

If a LIMIT is used, locking stops once enough rows have been returned to
satisfy the limit
(but note that rows skipped over by OFFSET will get locked). Similarly,
if FOR UPDATE or
FOR SHARE is used in a cursor's query, only rows actually fetched or
stepped past by the
cursor will be locked.

I can't find similar text in the 8.3 or 8.4 documentation.

AFAICT, and assuming I have not misunderstood this part of the
documentation this means that if one of my processing nodes selects a
block of 100 entries at offset 8000 in the resulting table, then every
other node will be blocked while the block is being processed, not just
the nodes that would have selected the rows in the range 0 to 7999, but
also >= 8100, because they cannot gain access to the rows.

Also, using FOR SHARE does not seem to solve the problem.

IMO, as a database non-expert, locking rows that were not returned as a
result of the query is a bug. As an example, if a query selects the X
last items in the matching rows, that is equivalent to locking the
table, or the relevant part of it, even if the requester have no
intention to modify those other rows.


Is there any way to avoid this problem? Or do I have to add a random
batch_id field to the queue table in order to separate the processes'
queries so that they do not block each other (as frequently)?

Is it possible to disable the source code causing this (that is,
reverting the patch that introduced the problem, or changing a
configuration switch)?




So, if I understand correctly, you:

q = SELECT record_id FROM queue
    WHERE project_id = my_project AND state = idle
    LIMIT n OFFSET i FOR UPDATE
while not q.eof
    update queue set state = started where record_id = x;
    process record_id
    update queue set state = finished where record_id = x;
    q.next;


Might I suggest an alternative:

q = update queue set state = started
    WHERE project_id = my_project AND state = idle
    LIMIT n OFFSET i
    RETURNING project_id;
idlist = @q;
commit;

foreach x in idlist
    process record_id
    begin
    update queue set state = finished where record_id = x;
    commit;



Forgive the part-perl part-python pseudocode.  Oh, and I've never done 
this, no idea if it actually works. :-)
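
One note: PostgreSQL's UPDATE has no LIMIT/OFFSET clause, so the claim step 
above would need a sub-select; a rough, untested sketch using the thread's 
column names (my_project, n and i are placeholders):

update queue set state = 'started'
 where record_id in (select record_id from queue
                      where project_id = my_project and state = 'idle'
                      limit n offset i)
returning record_id;

As with the SELECT ... FOR UPDATE version, two workers whose offsets overlap 
can still try to claim the same rows, so this changes the locking pattern 
rather than removing the race.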


-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Cluster table and order information

2011-02-01 Thread Dario Beraldi

Hello,

From the documentation of CLUSTER table  
(http://www.postgresql.org/docs/8.4/static/sql-cluster.html) I  
understand that clustering can be achieved by re-creating the table  
like this:


CREATE TABLE newtable AS
SELECT * FROM table ORDER BY columnlist;

My question is: If I upload with COPY a datafile which is already  
correctly sorted, can I inform postgres of such order, so that no  
clustering is necessary after the import? In other words, how can I  
tell postgres that my file is ordered by this and that column?


Many thanks!

Dario

--

Dr. Dario Beraldi
Institute of Evolutionary Biology
University of Edinburgh
West Mains Road
Edinburgh EH9 3JT
Scotland, UK


--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Using the copy protocol, is there a way to be notified of a potential error before sending?

2011-02-01 Thread Nicolas Grilly
Is the copy protocol (aka PQputCopyData and PQputCopyEnd) designed to send
data during many minutes, and is there a way to be notified of a potential
error (for example an invalid value in one column) before sending
PQputCopyEnd? Or do I have to send my data in small chunks (for example
batch of 1 rows), issue a PQputCopyEnd, check for errors, and continue
with the next chunk?

I tried the command \copy ... from stdio in psql and it looks like psql
has to read the entire input before returning an error, even if the invalid
value is in one of the first sent rows.

Thanks for your help and advice.

Regards,

Nicolas Grilly


[GENERAL] Book recommendation?

2011-02-01 Thread Herouth Maoz
As a result of the table bloat and other tuning issues I've been running into 
recently, I'm looking for a good resource for improving my tuning 
skills.

My sysadmin ran into the following book:

PostgreSQL 9.0 High Performance, by Gregory Smith, ISBN 184951030X
http://amzn.com/184951030X

Which covers versions 8.1 through 9.

Any opinions on this book? Other suggestions?

Thank you,
Herouth
-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Cluster table and order information

2011-02-01 Thread Andy Colson

On 2/1/2011 9:08 AM, Dario Beraldi wrote:

Hello,

 From the documentation of CLUSTER table
(http://www.postgresql.org/docs/8.4/static/sql-cluster.html) I
understand that clustering can be achieved by re-creating the table like
this:

CREATE TABLE newtable AS
SELECT * FROM table ORDER BY columnlist;

My question is: If I upload with COPY a datafile which is already
correctly sorted, can I inform postgres of such order, so that no
clustering is necessary after the import? In other words, how can I tell
postgres that my file is order by this and that column?

Many thanks!

Dario



The planner has no knowledge of cluster.  Meaning PG will query a 
clustered and unclustered table exactly the same way.  A table is not 
marked or anything as clustered.  And in fact, during usage of a table 
it'll become unclustered.


Clustering is only useful when you are going to read multiple records in 
the same order as an index.  It turns random seeks into sequential reads.


If your COPY loads data in indexed order, then just don't run the cluster.
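
For what it's worth, the planner does track how closely the physical row order 
follows each column (the correlation statistic gathered by ANALYZE), which you 
can inspect yourself; a small sketch, assuming a table named mytable:

ANALYZE mytable;
SELECT attname, correlation
  FROM pg_stats
 WHERE tablename = 'mytable';
-- correlation near 1.0 (or -1.0) means the rows are already close to that column's order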

-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Book recommendation?

2011-02-01 Thread tv
 As a result of my recent encounter with table bloat and other tuning
 issues I've been running into, I'm looking for a good resource for
 improving my tuning skills.

 My sysadmin ran into the following book:

 PostgreSQL 9.0 High Performance, by Gregory Smith, ISBN 184951030X
 http://amzn.com/184951030X

 Which covers versions 8.1 through 9.

 Any opinions on this book? Other suggestions?

100% positive - go and buy it. It's very thorough, it covers a lot of
associated topics (hw, filesystem, ...) and it's fresh. There were several
reviews at planet.postgresql.org and all of them were very positive.

regards
Tomas


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Book recommendation?

2011-02-01 Thread Jeff Ross

On 02/01/11 08:55, Herouth Maoz wrote:

As a result of my recent encounter with table bloat and other tuning issues 
I've been running into, I'm looking for a good resource for improving my tuning 
skills.

My sysadmin ran into the following book:

PostgreSQL 9.0 High Performance, by Gregory Smith, ISBN 184951030X
http://amzn.com/184951030X

Which covers versions 8.1 through 9.

Any opinions on this book? Other suggestions?

Thank you,
Herouth


I bought this when it first came out.  I bought the pdf version directly 
from Packt bundled with the PostgreSQL 9 Admin Cookbook for an amazingly 
low price.


I doubt you'll find a better book resource for recent versions of 
PostgreSQL.  Both are  well and clearly written, and cover a lot of 
ground in great detail.


Greg Smith has helped me and countless others on this and the 
Performance list and he is truly an authoritative resource.


Jeff

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PD_ALL_VISIBLE flag was incorrectly set in relation pg_statistic

2011-02-01 Thread bricklen
On Mon, Jan 31, 2011 at 2:55 PM, bricklen brick...@gmail.com wrote:
 We just had a slew of the following messages in our log. How concerned
 should I be at this point? I have no idea what triggered it.

 Version:
 PostgreSQL 9.0.2 on x86_64-unknown-linux-gnu, compiled by GCC gcc
 (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit

 2011-01-31 14:16:13 PST [32767]: [1-1] (user=) (rhost=) WARNING:
 PD_ALL_VISIBLE flag was incorrectly set in relation pg_statistic
 page 22
...

Are there any other details I can provide to help diagnose this issue?
It is a production server, so I'd like to rule out the likelihood of
corruption etc if possible. There were about 3400 entries related to
that message yesterday, referencing two tables (pg_statistics and a
user table). The messages haven't reoccurred, and I've run a couple of
cluster-wide vacuum analyzes to see if I could shake anything else
out.

Cheers,

Bricklen

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Problem with encode () and hmac() in pgcrypto

2011-02-01 Thread hlcborg

Hi,

I am trying to compute a hash string with 172 bytes. 
I have a openssl RSA private key stored in the postgresql db and a string
with some text.

I want to sign that string with the private key and the SHA1 algorithm and
encode the result in the base64 format.

I am writing it inside the pg database in a function that performs other
stuff.

String to sign:

v_em_crt_conc = '2011-01-31;2011-02-01T13:33:38;100036;684.40; '

Private RSA Key:
This key was generated in OpenSSL with this command:
~$ openssl genrsa -out abc.pem 1024
and then the content was copied to v_Private

v_Private =
'MIICXAIBAAKBgQCfaFpbOjsz0fKygWc9zdvu1XjOSJEZJ9XwlAyayxt2A57OUjxJ
GJZBwZDLWx+rYga2B04v5MigY9q/TfrSzbQZpmPf6hC/U36y7pgPce1ijosl7/on
4UejIlLAcqJgkoJVrkaRp0NuX5hcmFiR2z8b+ypHtF3t/JOHkz+gUAmc7wIDAQAB
AoGBAIuNcbz356hgHpaDjvyeYfJe1FnTUaOyKitpkPda5HmWhcqeV8SuT6zVZouB
BOKm+LUXBC4Nnk473N8px3IRP57rCaLwFdQCrfVMieAkdVPoLIryofo81jF4bbOo
yUJ1E901lCbSW8bnPhrWz1zFVWBUHoik2aWIiETs8v7HD8RBAkEA0YKX0SngYo6J
5BX8en9kG0cpE/oXrTOP5cTXuyOK/bG2pikVXfozSIZx3rbszAtAjSH72Mi/djLo
WsbEKL0KEQJBAMLHpLOssg06Hyj6XYGi8l6Yhxy2Vsv6qtwbxZ3soysuZrcmxgNm
ek1wVyDsuTKRPQYcFgyi2fhaucXPKgzNNv8CQAIqDDFv8k5yVIBTrdECIeGaQZVg
PwBeCP2BpAzd2CC4xj8/K2ZWIFN+eAyp7RZKg3cxw61AnaG18uqFJR+anFECQB8b
yadUpp9MMBvYWMTSpIkNLU8dpCEx6MX8vYQqfijwYUFRNQDoBhKGJbRBepaj8a4f
IDpmbA+pVdBM9PdUczsCQGrwaHYQw0NhUz2WIGvog8HUbkyAPLiWLpwWvwpjoAFs
LcoEyU/UxuNTwGSL3o5aYCo1RFCZH0sBnhd4S0DH0JI='

The FUNCTION:

encode(hmac(v_em_crt_conc, v_Private,'sha1'),'base64');

The Result:

h6CpmrP1QCE/Mp3xn3utUEPtftg=  This hash has 28 chars 


When I use OpenSSL in command line like this:

~$ echo "2011-01-31;2011-02-01T13:33:38;100036;684.40; " | openssl dgst
-sha1 -sign abc.pem | openssl enc -base64 -A

The Result is:

nKfxnt31+kk/RnKihJ0jKufq+nZvmPjVauGo8+tqJ1Y/ah/mAu4jSS1wnzU+wRygZ4CLIV9DGSs9bxBc4r9e71C8s9B5ms6Kpggmc12kdmqVHBRO28bPWb/YLCej59gZFFkvcCudweNAT4qHvVqWsOtFCf9kE4q92UIv1JcwSDU=

This hash has 172 chars 

Does someone know where my problem is? Is there another way to implement
this, or is it simply impossible?

Thanks in advance

Luis




-- 
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Problem-with-encode-and-hmac-in-pgcrypto-tp3366420p3366420.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Cluster table and order information

2011-02-01 Thread Andy Colson

On 2/1/2011 10:17 AM, Dario Beraldi wrote:

Quoting Andy Colson a...@squeakycode.net:


On 2/1/2011 9:08 AM, Dario Beraldi wrote:

Hello,

From the documentation of CLUSTER table
(http://www.postgresql.org/docs/8.4/static/sql-cluster.html) I
understand that clustering can be achieved by re-creating the table like
this:

CREATE TABLE newtable AS
SELECT * FROM table ORDER BY columnlist;

My question is: If I upload with COPY a datafile which is already
correctly sorted, can I inform postgres of such order, so that no
clustering is necessary after the import? In other words, how can I tell
postgres that my file is order by this and that column?

Many thanks!

Dario



The planner has no knowledge of cluster. Meaning PG will query a
clustered and unclustered table exactly the same way. A table is not
marked or anything as clustered. And in fact, during usage of a table
it'll become unclustered.

Clustering is only useful when you are going to read multiple records
in the same order as an index. It turns more random seeks into more
sequential reads.

If your COPY loads data in indexed order, then just dont run the cluster.

-Andy


Thanks very much Andy, this clarifies my doubts.

I was misled by the docs saying "When a table is clustered, PostgreSQL
remembers which index it was clustered by", which made me think that the
order information is stored somewhere.

All the best
Dario



The next sentence clears it up:

"The form CLUSTER table_name reclusters the table using the same index 
as before."



-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Weird performance issue with custom function with a for loop.

2011-02-01 Thread Tom Lane
Nicos Panayides ni...@magneta.com.cy writes:
 On 01/31/2011 08:48 PM, Tom Lane wrote:
 It seems likely that you're getting a different plan for the generic
 case because that user id isn't representative of the overall average
 for the column.

 I tried the prepared statement with both $1 and 1811 for user_id and 
 here's the plans I got:

[ bad ]
  ->  Seq Scan on game_round_actions  (cost=0.00..51702078.26 rows=314 width=53)
        Filter: ((action_time >= $2) AND (action_time <= $3) AND (sub_action_id = 0) AND (user_id = $1))

[ good ]
  ->  Index Scan using i_session on game_round_actions  (cost=0.00..224166.97 rows=300 width=53)
        Index Cond: ((action_time >= $2) AND (action_time <= $3))
        Filter: (user_id = 1811)

So the question is why it won't use that index in the parameterized case ...

 CREATE INDEX i_session
ON game_round_actions
USING btree
(action_time)
    WHERE user_id <> 0 AND sub_action_id = 0;

... and the answer is that it can't prove user_id <> 0 when it doesn't
know the value of the parameter equated to user_id, so it cannot build
a plan that relies on using that partial index.  (IOW, if it did use the
index, it would get the wrong answer if $1 happened to be zero.)

I don't know the reason you had for making the index partial in the
first place, but maybe you should reconsider that.  Another possibility
is to explicitly include user_id <> 0 in the query conditions, if
you're certain that the passed-in value is never zero.
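
For illustration, that explicit condition would just ride along in the query's 
WHERE clause; a sketch using the columns from this thread (the select list is 
elided as in the earlier example):

SELECT ...
  FROM game_round_actions
 WHERE user_id = $1
   AND user_id <> 0        -- matches the partial index predicate, so the planner can prove it
   AND action_time BETWEEN $2 AND $3
   AND sub_action_id = 0;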

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Cluster table and order information

2011-02-01 Thread Dario Beraldi

Quoting Andy Colson a...@squeakycode.net:


On 2/1/2011 9:08 AM, Dario Beraldi wrote:

Hello,

From the documentation of CLUSTER table
(http://www.postgresql.org/docs/8.4/static/sql-cluster.html) I
understand that clustering can be achieved by re-creating the table like
this:

CREATE TABLE newtable AS
SELECT * FROM table ORDER BY columnlist;

My question is: If I upload with COPY a datafile which is already
correctly sorted, can I inform postgres of such order, so that no
clustering is necessary after the import? In other words, how can I tell
postgres that my file is order by this and that column?

Many thanks!

Dario



The planner has no knowledge of cluster.  Meaning PG will query a  
clustered and unclustered table exactly the same way.  A table is  
not marked or anything as clustered.  And in fact, during usage of a  
table it'll become unclustered.


Clustering is only useful when you are going to read multiple  
records in the same order as an index.   It turns more random  
seeks into more sequential reads.


If your COPY loads data in indexed order, then just dont run the cluster.

-Andy


Thanks very much Andy, this clarifies my doubts.

I was misled by the docs saying "When a table is clustered, PostgreSQL  
remembers which index it was clustered by", which made me think that  
the order information is stored somewhere.


All the best
Dario

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Select for update with offset interferes with concurrent transactions

2011-02-01 Thread Yngve Nysaeter Pettersen

Hi,

Thanks for the quick answer, Andy.

On Tue, 01 Feb 2011 16:19:17 +0100, Andy Colson a...@squeakycode.net  
wrote:


snip

So, if I understand correctly, you:

q = SELECT record_id FROM queue
WHERE project_id = my_project AND state = idle
LIMIT n OFFSET i FOR UPDATE
while not q.eof
update queue set state = started where record_id = x;
process record_id
update queue set state = finsihed where record_id = x;
q.next;


Almost, the update to started is done for all selected elements first,  
releasing the lock, then the items are processed one at a time, marking  
each finished as they complete. (each processing step can take minutes,  
so keeping a lock the whole time is not an option)



Might I suggest and alternative:

q = update queue set state = started
WHERE project_id = my_project AND state = idle
LIMIT n OFFSET i
RETURNING project_id;
idlist = @q;
commit;

foreach x in idlist
process record_id
begin
update queue set state = finsihed where record_id = x;
commit;

Forgive the part perl part python sudocode.  Oh, and I've never done  
this, no idea if it actually works. :-)


Thanks for that suggestion, I'll take a look at it.

While I hadn't caught on to the RETURNING part, I had been wondering if  
using a single step UPDATE might be a solution. One concern I have is how  
concurrent updates will affect the returned list (or if they will just be  
skipped, as SELECT would in normal transaction mode, if I understood  
correctly), or whether it might return with an error code (I know that the  
normal update return value is the number of updated items, just not sure  
if that applies for RETURNING).


Although, I will note that this process (if it works) will, sort of, make  
FOR UPDATE redundant. Or, if it doesn't, the current lock-policy might  
cause issues for concurrent updates for the use-cases where FOR UPDATE is  
relevant.


--
Sincerely,
Yngve N. Pettersen

Senior Developer Email: yn...@opera.com
Opera Software ASA   http://www.opera.com/
Phone:  +47 23 69 32 60  Fax:+47 23 69 24 01


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Select for update with offset interferes with concurrent transactions

2011-02-01 Thread Andy Colson

On 2/1/2011 10:59 AM, Yngve Nysaeter Pettersen wrote:

Hi,

Thanks for the quick answer, Andy.

On Tue, 01 Feb 2011 16:19:17 +0100, Andy Colson a...@squeakycode.net
wrote:

snip

So, if I understand correctly, you:

q = SELECT record_id FROM queue
WHERE project_id = my_project AND state = idle
LIMIT n OFFSET i FOR UPDATE
while not q.eof
update queue set state = started where record_id = x;
process record_id
update queue set state = finsihed where record_id = x;
q.next;


Almost, the update to started is done for all selected elements first,
releasing the lock, then the items are processed one at a time, marking
each finished as they complete. (each processing step can take
minutes, so keeping a lock the whole time is not an option)


Might I suggest and alternative:

q = update queue set state = started
WHERE project_id = my_project AND state = idle
LIMIT n OFFSET i
RETURNING project_id;
idlist = @q;
commit;

foreach x in idlist
process record_id
begin
update queue set state = finsihed where record_id = x;
commit;

Forgive the part perl part python sudocode. Oh, and I've never done
this, no idea if it actually works. :-)


Thanks for that suggestion, I'll take a look at it.

While I hadn't caught on to the RETURNING part, I had been wondering
if using a single step UPDATE might be a solution. One concern I have is
how concurrent updates will affect the returned list (or if they will
just be skipped, as SELECT would in normal transaction mode, if I
understood correctly), or whether it might return with an error code (I
know that the normal update return value is the number of updated items,
just not sure if that applies for RETURNING).

Although, I will note that this process (if it works) will, sort of,
make FOR UPDATE redundant. Or, if it doesn't, the current lock-policy
might cause issues for concurrent updates for the use-cases where FOR
UPDATE is relevant.



Yeah, I'd wondered the same thing.  It could be that two updates hitting the 
same row will deadlock, or maybe not, I'm not sure.  But I think it's the 
same as with the select: if you happen to have two limits that hit the 
same range, you're in trouble.


I think the random limit thing is a race condition itself.  Whenever you 
have multiple processes hitting the same rows you're going to run into 
problems.  Have you thought of using a sequence instead of a random 
limit?  Each process could get the next 100 record_ids via a sequence, 
then there would be much less chance of deadlock.


-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Select for update with offset interferes with concurrent transactions

2011-02-01 Thread Tom Lane
Yngve Nysaeter Pettersen yn...@opera.com writes:
 To avoid having the processes trample each other's queries (the first  
 attempt was to select the first matching entries of the table, which  
 caused one to block all other transactions), one of the steps I took was  
 to select a set of idle rows at a random offset into the table from the  
 project, mark them for update, then update each record's state as started.

SELECT record_id FROM queue WHERE project_id = my_project AND state =  
 idle LIMIT n OFFSET i FOR UPDATE

 At present n is 100-150, i is a random value in the range 0-1.

 There is, intentionally, no ordering specified, since that would just slow  
 down the query, and is not necessary.

This seems like a pretty bad design.  There are recognized ways to solve
this problem with more predictability and much less chance of different
processes blocking each other.  In particular, this query seems to be based
on some untenable assumptions about the physical row order being stable.
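
One approach along those lines that is often suggested (a hedged sketch, not 
spelled out in this message, using the thread's table and column names) is to 
let each worker claim rows with advisory locks instead of a random OFFSET:

-- sketch only: each worker grabs rows no other worker has already claimed;
-- 42 stands in for the real project id, and each lock should be released
-- with pg_advisory_unlock(record_id) once the task is finished
SELECT record_id
  FROM queue
 WHERE project_id = 42
   AND state = 'idle'
   AND pg_try_advisory_lock(record_id)
 LIMIT 100;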

 What I've discovered when using Postgres 9.0 is that the processes are now  
 blocking every other query into this table,

In 9.0, LIMIT/OFFSET processing is done after FOR UPDATE locking, which
means that rows skipped over by OFFSET still get locked, which means
that different sessions executing this query are now practically certain
to block each other, rather than just likely to block each other.
This was an intentional change to improve the predictability of FOR
UPDATE's interactions with LIMIT/OFFSET, and indeed it's improved the
predictability of the behavior for you, just not in the direction you'd
like :-(

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Book recommendation?

2011-02-01 Thread Vick Khera
On Tue, Feb 1, 2011 at 11:14 AM, Jeff Ross jr...@wykids.org wrote:
 I doubt you'll find a better book resource for recent versions of
 PostgreSQL.  Both are  well and clearly written, and cover a lot of ground
 in great detail.

 Greg Smith has helped me and countless others on this and the Performance
 list and he is truly an authoritative resource.


I've hired Greg for performance tuning work, and I must say he
*really* knows what he's doing. I've read some chapters in the book
and he seems to get across his knowledge very well into those pages.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Select for update with offset interferes with concurrent transactions

2011-02-01 Thread Yngve Nysaeter Pettersen

On Tue, 01 Feb 2011 18:18:17 +0100, Tom Lane t...@sss.pgh.pa.us wrote:


Yngve Nysaeter Pettersen yn...@opera.com writes:

To avoid having the processes trample each other's queries (the first
attempt was to select the first matching entries of the table, which
caused one to block all other transactions), one of the steps I took was
to select a set of idle rows at a random offset into the table from the
project, mark them for update, then update each record's state as  
started.



   SELECT record_id FROM queue WHERE project_id = my_project AND state =
idle LIMIT n OFFSET i FOR UPDATE



At present n is 100-150, i is a random value in the range 0-1.


There is, intentionally, no ordering specified, since that would just  
slow

down the query, and is not necessary.


This seems like a pretty bad design.


Well, I don't claim to be a database expert ;).

While there might be better ways, the current one has worked OK in the  
year since it was implemented.



There are recognized ways to solve
this problem with more predictability and much less chance of different


I'd appreciate it if you could provide a couple of pointers.


processes blocking each other.  In particular, this query seems be based
on some untenable assumptions about the physical row order being stable.


No, it does not assume that the row order is stable; I don't really care  
about the order of the elements, since the actual order of task execution  
depends much more significantly on other variables, and the actual order  
isn't important at all (although further design changes might impose some  
limited category grouping on the queue, that would still not make the  
ordering important within the group).


What I've discovered when using Postgres 9.0 is that the processes are  
now

blocking every other query into this table,


In 9.0, LIMIT/OFFSET processing is done after FOR UPDATE locking, which
means that rows skipped over by OFFSET still get locked, which means
that different sessions executing this query are now practically certain
to block each other, rather than just likely to block each other.
This was an intentional change to improve the predictability of FOR
UPDATE's interactions with LIMIT/OFFSET, and indeed it's improved the
predictability of the behavior for you, just not in the direction you'd
like :-(


That might be, but is it necessary to continue locking (which is what it  
sounds like to me) the elements that are not used in the final response  
past completing the query?


What happens now, if I understand it correctly, is that if a "select foo  
from bar limit 1 order by whatever offset tablelen-1 for update" is  
performed, the effective operation is also LOCK bar, not just a row lock  
on item tablelen-1 in that table. Was that the intention?  (and yes, I am  
aware that ordering might be used to reverse that sequence so offset 0 can  
be used, but wouldn't that just as much block the query for offset 1?)



--
Sincerely,
Yngve N. Pettersen

Senior Developer Email: yn...@opera.com
Opera Software ASA   http://www.opera.com/
Phone:  +47 23 69 32 60  Fax:+47 23 69 24 01


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Select for update with offset interferes with concurrent transactions

2011-02-01 Thread Yngve Nysaeter Pettersen
On Tue, 01 Feb 2011 18:11:23 +0100, Andy Colson a...@squeakycode.net  
wrote:



On 2/1/2011 10:59 AM, Yngve Nysaeter Pettersen wrote:

Hi,

Thanks for the quick answer, Andy.

On Tue, 01 Feb 2011 16:19:17 +0100, Andy Colson a...@squeakycode.net
wrote:

snip

So, if I understand correctly, you:

q = SELECT record_id FROM queue
WHERE project_id = my_project AND state = idle
LIMIT n OFFSET i FOR UPDATE
while not q.eof
update queue set state = started where record_id = x;
process record_id
update queue set state = finsihed where record_id = x;
q.next;


Almost, the update to started is done for all selected elements first,
releasing the lock, then the items are processed one at a time, marking
each finished as they complete. (each processing step can take
minutes, so keeping a lock the whole time is not an option)


Might I suggest and alternative:

q = update queue set state = started
WHERE project_id = my_project AND state = idle
LIMIT n OFFSET i
RETURNING project_id;
idlist = @q;
commit;

foreach x in idlist
process record_id
begin
update queue set state = finsihed where record_id = x;
commit;

Forgive the part perl part python sudocode. Oh, and I've never done
this, no idea if it actually works. :-)


Thanks for that suggestion, I'll take a look at it.

While I hadn't caught on to the RETURNING part, I had been wondering
if using a single step UPDATE might be a solution. One concern I have is
how concurrent updates will affect the returned list (or if they will
just be skipped, as SELECT would in normal transaction mode, if I
understood correctly), or whether it might return with an error code (I
know that the normal update return value is the number of updated items,
just not sure if that applies for RETURNING).

Although, I will note that this process (if it works) will, sort of,
make FOR UPDATE redundant. Or, if it doesn't, the current lock-policy
might cause issues for concurrent updates for the use-cases where FOR
UPDATE is relevant.



Yeah, I'd wondered the same thing.  It could be two updates hitting the  
same row will deadlock, or maybe not, I'm not sure.  But I think its the  
same as with the select, if you happen to have two limits that hit the  
same range, you're in trouble.


I think the random limit thing is a race condition itself.  Whenever you  
have multiple processes hitting the same rows you're going to run into  
problems.  Have you thought of using a sequence instead of a random  
limit?  Each process could get the next 100 record_id'd via  a sequence,  
then there would be much less chance of deadlock.


How would that work, in case you would like to provide an example?

I am not really familiar with sequences, as I have only seen them used for  
the id field in Django generated tables.


In case it is relevant, the processes do not (currently, at least) have  
a unique ID; though they have a local sequence number for the machine they  
are running on.



--
Sincerely,
Yngve N. Pettersen

Senior Developer Email: yn...@opera.com
Opera Software ASA   http://www.opera.com/
Phone:  +47 23 69 32 60  Fax:+47 23 69 24 01


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Book recommendation?

2011-02-01 Thread Chris Browne
hero...@unicell.co.il (Herouth Maoz) writes:
 As a result of my recent encounter with table bloat and other tuning
 issues I've been running into, I'm looking for a good resource for
 improving my tuning skills.

 My sysadmin ran into the following book:

 PostgreSQL 9.0 High Performance, by Gregory Smith, ISBN 184951030X
 http://amzn.com/184951030X

 Which covers versions 8.1 through 9.

 Any opinions on this book? Other suggestions?

It's the best thing in that vein that is available, and is rather good.

The only thing that's *somewhat* comparable is _PostgreSQL_ by Korry
Douglas & Susan Douglas, which, alas, is now 7 years old, and thus
rather dated.  It was the one reference that had substantive material on
query planning, though that's likely somewhat less relevant to you.

You should certainly take a look at Greg Smith's book.
-- 
(reverse (concatenate 'string moc.liamg @ enworbbc))
http://linuxdatabases.info/info/finances.html
what would  we do without C?  we   would have PASAL,  BASI, OBOL, and
Ommon Lisp. -- #Erik

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Select for update with offset interferes with concurrent transactions

2011-02-01 Thread Andy Colson

On 2/1/2011 12:10 PM, Yngve Nysaeter Pettersen wrote:

On Tue, 01 Feb 2011 18:11:23 +0100, Andy Colson a...@squeakycode.net

I think the random limit thing is a race condition itself. Whenever
you have multiple processes hitting the same rows you're going to run
into problems. Have you thought of using a sequence instead of a
random limit? Each process could get the next 100 record_id'd via a
sequence, then there would be much less chance of deadlock.


How would that work, in case you would like to provide an example?

I am not really familiar with sequences, as I have only seen them used
for the id field in Django generated tables.

In case it is relevant, the processes does not (currently, at least)
have a unique ID; though they have a local sequence number for the
machine they are running on.




I have a really simple q table I use.

create table q (id integer not null, msg integer, primary key(id));
create sequence q_add;
create sequence q_read;

I insert via q_add:

andy=# insert into q(id, msg) values(nextval('q_add'), 20);
INSERT 0 1
andy=# insert into q(id, msg) values(nextval('q_add'), 4);
INSERT 0 1
andy=# select * from q;
 id | msg
----+-----
  1 |  20
  2 |   4
(2 rows)


Then I run multiple batch proc's which get their next job like:

andy=# select msg from q where id = (select nextval('q_read'));
 msg
-----
  20
(1 row)

andy=# select msg from q where id = (select nextval('q_read'));
 msg
-----
   4
(1 row)


It works for me because I can empty the q table, reset the q_add and 
q_read sequences and start over clean.  Not sure if it would work for 
your setup.
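
A hedged illustration of that reset step, in case it helps; it assumes nothing 
beyond the q table and the two sequences defined above:

truncate q;                          -- empty the queue table
select setval('q_add', 1, false);    -- next nextval('q_add') returns 1 again
select setval('q_read', 1, false);   -- same for the reader sequence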



-Andy




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Weird performance issue with custom function with a for loop.

2011-02-01 Thread Nicos Panayides

On 02/01/2011 06:52 PM, Tom Lane wrote:

Nicos Panayides <ni...@magneta.com.cy> writes:
   

On 01/31/2011 08:48 PM, Tom Lane wrote:
 

It seems likely that you're getting a different plan for the generic
case because that user id isn't representative of the overall average
for the column.
   
   

I tried the prepared statement with both $1 and 1811 for user_id and
here's the plans I got:
 

[ bad ]
   

->  Seq Scan on game_round_actions  (cost=0.00..51702078.26 rows=314 width=53)
      Filter: ((action_time >= $2) AND (action_time <= $3) AND (sub_action_id = 0) AND (user_id = $1))
 

[ good ]
   

->  Index Scan using i_session on game_round_actions  (cost=0.00..224166.97 rows=300 width=53)
      Index Cond: ((action_time >= $2) AND (action_time <= $3))
      Filter: (user_id = 1811)
 

So the question is why it won't use that index in the parameterized case ...

   

CREATE INDEX i_session
ON game_round_actions
USING btree
(action_time)
WHERE user_id <> 0 AND sub_action_id = 0;
 

... and the answer is that it can't prove user_id <> 0 when it doesn't
know the value of the parameter equated to user_id, so it cannot build
a plan that relies on using that partial index.  (IOW, if it did use the
index, it would get the wrong answer if $1 happened to be zero.)

I don't know the reason you had for making the index partial in the
first place, but maybe you should reconsider that.  Another possibility
is to explicitly include user_id <> 0 in the query conditions, if
you're certain that the passed-in value is never zero.

regards, tom lane
   

I added an additional user_id <> 0 and performance is normal.
I didn't know that the database does not use the value of the parameter 
to choose a plan. It makes sense now that
you have explained it.

The reason we use a partial index is that the table is very huge and we 
are really interested only in those rows. Actually the condition should 
have used user_id IS NOT NULL instead of <> 0.
Unfortunately the index takes quite a few hours to build, so we 
implemented the workaround you suggested for now. If the condition were 
user_id IS NOT NULL AND sub_action_id = 0, would postgres always choose 
the index, since user_id = NULL will always be false?
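
For reference, the index being described would look something like this (a 
sketch; the name is made up, and as noted it would take hours to build on a 
table this size):

CREATE INDEX i_session_not_null
  ON game_round_actions
  USING btree
  (action_time)
  WHERE user_id IS NOT NULL AND sub_action_id = 0;

A query whose WHERE clause includes user_id IS NOT NULL could then use it even 
with a parameterized user_id, because the planner can prove the index predicate 
from the query text alone.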


Thank you very much for your help!

--
Regards,

Nicos Panayides
IT Manager

Magneta Technologies Ltd
Tel: +357 22721919, 22317400
Fax: +357 22721917
Web: http://www.magneta.eu


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Select for update with offset interferes with concurrent transactions

2011-02-01 Thread Yngve Nysaeter Pettersen


Thanks Andy,

On Tue, 01 Feb 2011 19:29:08 +0100, Andy Colson a...@squeakycode.net
wrote:


On 2/1/2011 12:10 PM, Yngve Nysaeter Pettersen wrote:

On Tue, 01 Feb 2011 18:11:23 +0100, Andy Colson a...@squeakycode.net

I think the random limit thing is a race condition itself. Whenever
you have multiple processes hitting the same rows you're going to run
into problems. Have you thought of using a sequence instead of a
random limit? Each process could get the next 100 record_id'd via a
sequence, then there would be much less chance of deadlock.


How would that work, in case you would like to provide an example?

I am not really familiar with sequences, as I have only seen them used
for the id field in Django generated tables.

In case it is relevant, the processes does not (currently, at least)
have a unique ID; though they have a local sequence number for the
machine they are running on.




I have a really simple q table I use.

 create table q (id integer not null, msg integer, primary key(id));
 create sequence q_add;
 create sequence q_read;

I insert via q_add:

andy=# insert into q(id, msg) values(nextval('q_add'), 20);
INSERT 0 1
andy=# insert into q(id, msg) values(nextval('q_add'), 4);
INSERT 0 1
andy=# select * from q;
  id | msg
+-
   1 |  20
   2 |   4
(2 rows)


Then I run multiple batch proc's which get their next job like:

andy=# select msg from q where id = (select nextval('q_read'));
  msg
-
   20
(1 row)

andy=# select msg from q where id = (select nextval('q_read'));
  msg
-
4
(1 row)


It works for me because I can empty the q table, reset the q_add and  
q_read sequences and start over clean.  Not sure if it would work for  
your setup.


I see how that would work (it is essentially how Django assigns row ids).

My current setup can have multiple runs configured at a time (and I have had
several dozen queued, in one case), with varying priorities on each run,
and they might, at least theoretically, be configured in parallel (even
the individual runs are set up in parallel), meaning the ids would not be
sequential (a sequence is used for the id field in each row of the table),
unless they could somehow be allocated for each individual run/project
(multiple sequence objects, one for each run might be an option, but I
don't like that possibility). And as I mentioned elsewhere in the thread I
might make the queuing a bit more complex, which might make this system
even more complicated.

So, AFAICT I am afraid it would not work in the general case for my  
project :( .

However, it might be useful in somebody else's project :) .

--
Sincerely,
Yngve N. Pettersen

Senior Developer Email: yn...@opera.com
Opera Software ASA   http://www.opera.com/
Phone:  +47 23 69 32 60  Fax:+47 23 69 24 01


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Select for update with offset interferes with concurrent transactions

2011-02-01 Thread Andy Colson

On 2/1/2011 12:51 PM, Yngve Nysaeter Pettersen wrote:


Thanks Andy,

On Tue, 01 Feb 2011 19:29:08 +0100, Andy Colson a...@squeakycode.net
wrote:


On 2/1/2011 12:10 PM, Yngve Nysaeter Pettersen wrote:

On Tue, 01 Feb 2011 18:11:23 +0100, Andy Colson a...@squeakycode.net

I think the random limit thing is a race condition itself. Whenever
you have multiple processes hitting the same rows you're going to run
into problems. Have you thought of using a sequence instead of a
random limit? Each process could get the next 100 record_id'd via a
sequence, then there would be much less chance of deadlock.


How would that work, in case you would like to provide an example?

I am not really familiar with sequences, as I have only seen them used
for the id field in Django generated tables.

In case it is relevant, the processes does not (currently, at least)
have a unique ID; though they have a local sequence number for the
machine they are running on.




I have a really simple q table I use.

create table q (id integer not null, msg integer, primary key(id));
create sequence q_add;
create sequence q_read;

I insert via q_add:

andy=# insert into q(id, msg) values(nextval('q_add'), 20);
INSERT 0 1
andy=# insert into q(id, msg) values(nextval('q_add'), 4);
INSERT 0 1
andy=# select * from q;
id | msg
+-
1 | 20
2 | 4
(2 rows)


Then I run multiple batch proc's which get their next job like:

andy=# select msg from q where id = (select nextval('q_read'));
msg
-
20
(1 row)

andy=# select msg from q where id = (select nextval('q_read'));
msg
-
4
(1 row)


It works for me because I can empty the q table, reset the q_add and
q_read sequences and start over clean. Not sure if it would work for
your setup.


I see how that would work (it is essentially how Django assigns row ids).

My current setup can have multiple runs configured at a time (and have had
several dozen queued, in one case), with varying priorities on each run,
and they might, at least theoretically, be configured in parallel (even
the individual runs are set up in parallel), meaning the ids would not be
sequential (a sequence is used for the id field in each row of the table),
unless they could somehow be allocated for each individual run/project
(multiple sequence objects, one for each run might be an option, but I
don't like that possibility). And as I mentioned elsewhere in the thread I
might make the queuing a bit more complex, which might make this system
even more complicated.

So, AFAICT I am afraid it would not work in the general case for my
project :( .
However, it might be useful in somebody else's project :) .



No, I didn't think it would work for you; yours looks much more 
complicated than mine.  Just out of curiosity, have you looked at PgQ?


-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Select for update with offset interferes with concurrent transactions

2011-02-01 Thread Radosław Smogura
Hmm...

May I ask how this looks in detail? If e.g. I do select * from myeshop offset 
100 limit 20, and I have 1000 rows, which rows will be locked?

a) 0 to 120, or
b) all rows will be locked?

Kind regards,
Radek

Tom Lane t...@sss.pgh.pa.us Tuesday 01 February 2011 18:18:17
 In 9.0, LIMIT/OFFSET processing is done after FOR UPDATE locking, which
 means that rows skipped over by OFFSET still get locked, which means
 that different sessions executing this query are now practically certain
 to block each other, rather than just likely to block each other.
 This was an intentional change to improve the predictability of FOR
 UPDATE's interactions with LIMIT/OFFSET, and indeed it's improved the
 predictability of the behavior for you, just not in the direction you'd
 like :-(
 
   regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Select for update with offset interferes with concurrent transactions

2011-02-01 Thread Yngve Nysaeter Pettersen
On Tue, 01 Feb 2011 20:04:31 +0100, Andy Colson a...@squeakycode.net  
wrote:



On 2/1/2011 12:51 PM, Yngve Nysaeter Pettersen wrote:


So, AFAICT I am afraid it would not work in the general case for my
project :( .
However, it might be useful in somebody else's project :) .



No, I didn't think it would work for you, yours looks much more  
complicated than main.  Just out of curiosity, have you looked at PgQ?


I did look around for some queuing systems a year ago; I am not sure if  
that one crossed my path, but I didn't find any that I thought would work  
for me. That might just be because I had only recently started with  
database programming (which was also the reason I chose a framework like  
Django for most of it; the FOR UPDATE SQL is one of fewer than 10 places  
where I use raw SQL in my system, because Django could not provide the  
functionality) and did not realize that it could help me.


Regarding PgQ, based on a quick skimming I am not sure how it would fit in  
my case. This may be because the tutorial leaves (IMO) a bit too much up  
in the air regarding how the system it is working in is organized, at  
least for a relative beginner as myself, and also not how a similar  
alternative system would look. A small complete example showing all the  
tables involved, the client(s), the server(s), and the operations  
performed, might have helped.




--
Sincerely,
Yngve N. Pettersen

Senior Developer Email: yn...@opera.com
Opera Software ASA   http://www.opera.com/
Phone:  +47 23 69 32 60  Fax:+47 23 69 24 01


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Book recommendation?

2011-02-01 Thread Scott Marlowe
On Tue, Feb 1, 2011 at 8:56 AM, Herouth Maoz hero...@unicell.co.il wrote:
 My sysadmin ran into the following book:

 PostgreSQL 9.0 High Performance, by Gregory Smith, ISBN 184951030X
 http://amzn.com/184951030X

highly recommended.  Also take a look at the pg admin cookbook from packt.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] yum repo problem

2011-02-01 Thread John R Pierce
I have an older CentOS 4.8 server that is running 8.4.2 and wanted to 
update it.  It was installed via the older yum repo, before the reorgs...


# rpm -qa |grep postgres
postgresql-server-8.4.2-1PGDG.rhel4
postgresql-jdbc-8.3.605-1PGDG.rhel4
postgresql-libs-8.4.2-1PGDG.rhel4
postgresql-8.4.2-1PGDG.rhel4
compat-postgresql-libs-4-1PGDG.rhel4
postgresql-contrib-8.4.2-1PGDG.rhel4



I fetched the updated yum.pgrpms.org repo file, but when I try to update 
postgresql-server, I'm getting...


# rpm -ivh http://yum.pgrpms.org/reporpms/8.4/pgdg-centos-8.4-2.noarch.rpm


# yum update postgresql-server
Setting up Update Process
Setting up repositories
Reading repository metadata in from local files
Resolving Dependencies
--> Populating transaction set with selected packages. Please wait.
---> Package postgresql-server.i386 0:8.4.4-2PGDG.el4 set to be updated
--> Running transaction check
--> Processing Dependency: postgresql = 8.4.4-2PGDG.el4 for package: 
postgresql-server

--> Restarting Dependency Resolution with new changes.
--> Populating transaction set with selected packages. Please wait.
---> Package postgresql.i386 0:8.4.4-2PGDG.el4 set to be updated
--> Running transaction check
--> Processing Dependency: postgresql-libs = 8.4.4-2PGDG.el4 for 
package: postgresql
--> Processing Dependency: postgresql = 8.4.2 for package: 
postgresql-contrib

--> Restarting Dependency Resolution with new changes.
--> Populating transaction set with selected packages. Please wait.
---> Package postgresql-libs.i386 0:8.4.4-2PGDG.el4 set to be updated
---> Package postgresql-contrib.i386 0:8.4.4-2PGDG.el4 set to be updated
--> Running transaction check

Dependencies Resolved

=
 Package Arch   Version  Repository
Size

=
Updating:
 postgresql-server   i386   8.4.4-2PGDG.el4  pgdg84
4.5 M

Updating for dependencies:
 postgresql  i386   8.4.4-2PGDG.el4  pgdg84
1.3 M
 postgresql-contrib  i386   8.4.4-2PGDG.el4  pgdg84
376 k
 postgresql-libs i386   8.4.4-2PGDG.el4  pgdg84
179 k


Transaction Summary
=
Install  0 Package(s)
Update   4 Package(s)
Remove   0 Package(s)
Total download size: 6.4 M
Is this ok [y/N]: y
Downloading Packages:
warning: rpmts_HdrFromFdno: V3 DSA signature: NOKEY, key ID e8e345b8
Public key for postgresql-8.4.4-2PGDG.el4.i386.rpm is not installed
Retrieving GPG key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
GPG key at file:///etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG (0x442DF0F8) is 
already installed



The GPG keys listed for the PostgreSQL 8.4 4 - i386 repository are 
already installed but they are not correct for this package.

Check that the correct key URLs are configured for this repository.

# more /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
-BEGIN PGP PUBLIC KEY BLOCK-
Version: GnuPG v1.4.7 (GNU/Linux)

mQGiBEeD8koRBACC1VBRsUwGr9gxFFRho9kZpdRUjBJoPhkeOTvp9LzkdAQMFngr
BFi6N0ov1kCX7LLwBmDG+JPR7N+XcH9YR1coSHpLVg+JNy2kFDd4zAyWxJafjZ3a
9zFg9Yx+0va1BJ2t4zVcmKS4aOfbgQ5KwIOWUujalQW5Y+Fw39Gn86qjbwCg5dIo
tkM0l19h2sx50D027pV5aPsD/2c9pfcFTbMhB0CcKS836GH1qY+NCAdUwPs646ee
Ex/k9Uy4qMwhl3HuCGGGa+N6Plyon7V0TzZuRGp/1742dE8IO+I/KLy2L1d1Fxrn
XOTBZd8qe6nBwh12OMcKrsPBVBxn+iSkaG3ULsgOtx+HHLfa1/p22L5+GzGdxizr
peBuA/90cCp+lYcEwdYaRoFVR501yDOTmmzBc1DrsyWP79QMEGzMqa393G0VnqXt
L4pGmunq66Agw2EhPcIt3pDYiCmEt/obdVtSJH6BtmSDB/zYhbE8u3vLP3jfFDa9
KXxgtYj0NvuUVoRmxSKm8jtfmj1L7zoKNz3jl+Ba3L0WxIv4+bRBUG9zdGdyZVNR
TCBSUE0gQnVpbGRpbmcgUHJvamVjdCA8cGdzcWxycG1zLWhhY2tlcnNAcGdmb3Vu
ZHJ5Lm9yZz6IYAQTEQIAIAUCR4PySgIbIwYLCQgHAwIEFQIIAwQWAgMBAh4BAheA
AAoJEB8W0uFELfD4jnkAoMqd6ZwwsgYHZ3hP9vt+DJt1uDW7AKDbRwP8ESKFhwdJ
8m91RPBeJW/tMLkCDQRHg/JKEAgA64+ZXgcERPYfZYo4p+yMTJAAa9aqnE3U4Ni6
ZMB57GPuEy8NfbNya+HiftO8hoozmJdcI6XFyRBCDUVCdZ8SE+PJdOx2FFqZVIu6
dKnr8ykhgLpNNEFDG3boK9UfLj/5lYQ3Y550Iym1QKOgyrJYeAp6sZ+Nx2PavsP3
nMFCSD67BqAbcLCVQN7a2dAUXfEbfXJjPHXTbo1/kxtzE+KCRTLdXEbSEe3nHO04
K/EgTBjeBUOxnciH5RylJ2oGy/v4xr9ed7R1jJtshsDKMdWApwoLlCBJ63jg/4T/
z/OtXmu4AvmWaJxaTl7fPf2GqSqqb6jLCrQAH7AIhXr9V0zPZwADBQgAlpptNQHl
u7euIdIujFwwcxyQGfee6BG+3zaNSEHMVQMuc6bxuvYmgM9r7aki/b0YMfjJBk8v
OJ3Eh1vDH/woJi2iJ13vQ21ot+1JP3fMd6NPR8/qEeDnmVXu7QAtlkmSKI9Rdnjz
FFSUJrQPHnKsH4V4uvAM+njwYD+VFiwlBPTKNeL8cdBb4tPN2cdVJzoAp57wkZAN
VA2tKxNsTJKBi8wukaLWX8+yPHiWCNWItvyB4WCEp/rZKG4A868NM5sZQMAabpLd
l4fTiGu68OYgK9qUPZvhEAL2C1jPDVHPkLm+ZsD+90Pe66w9vB00cxXuHLzm8Pad
GaCXCY8h3xi6VIhJBBgRAgAJBQJHg/JKAhsMAAoJEB8W0uFELfD4K4cAoJ4yug8y
1U0cZEiF5W25HDzMTtaDAKCaM1m3Cbd+AZ0NGWNg/VvIX9MsPA==
=au6K
-END PGP PUBLIC KEY BLOCK-






so where do I go from here??!?




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Some Problems - Shall I reinstall the DB?

2011-02-01 Thread vwu98034
Regarding log messages: I get the following error message in my 
application for the connection problem, and there are NOT any error messages in 
the PG log file. It is the same with DbVisualizer - no error messages in the 
PG log file.
~~~
Caused by: org.postgresql.util.PSQLException: The connection attempt failed.
at 
org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:150)
at 
org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
at 
org.postgresql.jdbc2.AbstractJdbc2Connection.init(AbstractJdbc2Connection.java:125)
at 
org.postgresql.jdbc3.AbstractJdbc3Connection.init(AbstractJdbc3Connection.java:30)
at 
org.postgresql.jdbc3g.AbstractJdbc3gConnection.init(AbstractJdbc3gConnection.java:22)
at 
org.postgresql.jdbc4.AbstractJdbc4Connection.init(AbstractJdbc4Connection.java:30)
at org.postgresql.jdbc4.Jdbc4Connection.init(Jdbc4Connection.java:24)
at org.postgresql.Driver.makeConnection(Driver.java:393)
at org.postgresql.Driver.connect(Driver.java:267)
at 
org.apache.commons.dbcp.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:38)
at 
org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:294)
at 
org.apache.commons.dbcp.BasicDataSource.validateConnectionFactory(BasicDataSource.java:1247)
at 
org.apache.commons.dbcp.BasicDataSource.createDataSource(BasicDataSource.java:1221)
... 52 more
Caused by: java.io.EOFException
at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:261)
at 
org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:269)
at 
org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:108)
... 64 more
~~~

When I try to access PG from the PC command prompt with the psql command, I get 
the following error message:

psql: server closed the connection unexpectedly 
   This probably means the server terminated abnormally before or while 
processing the request.

This time an error message shows up in the PG log file: FATAL:  could not 
reattach to shared memory (key=5432001, addr=011D): Invalid argument

As far as I can tell, the PG log file doesn't tell the whole story. The version 
of PG is 8.0.


- Original Message -
From: Adrian Klaver adrian.kla...@gmail.com
To: vwu98...@lycos.com
Cc: pgsql-general pgsql-general@postgresql.org
Sent: 2011年1月31日 星期一 下午01时30分09秒 GMT -08:00 US/Canada Pacific
Subject: Re: [GENERAL] Some Problems - Shall I reinstall the DB?

On 01/31/2011 01:18 PM, vwu98...@lycos.com wrote:
 The DB connection problem has nothing to do with Hibernate. As I have said, 
 the problem can be solved by restarting the DB.
 This problem occurs with my application as well as DbVisualizer.

So now we know it is a generic connection problem. Is there information 
in the Postgres logs that might be helpful?


 And as I have mentioned, the id error doesn't occur to other table.

Again, is there any information generated in the logs when you try to 
connect to that table?  Another thing to check is whether the max id in 
the table is greater than the last value of the sequence associated 
with that table. It could be that id(s) got inserted without using the sequence.
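
A quick check along those lines would show it (the table and sequence names 
below are just placeholders):

SELECT max(id) FROM mytable;
SELECT last_value FROM mytable_id_seq;
-- If max(id) is larger than last_value, rows were inserted without the
-- sequence and the next nextval() can collide with an existing id.
-- SELECT setval('mytable_id_seq', (SELECT max(id) FROM mytable));
-- would bring the sequence back in step.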



At this time reinstalling the DB would seem to be excessive. There is no 
guarantee that the problem won't just appear again without knowing what 
caused it in the first place.

-- 
Adrian Klaver
adrian.kla...@gmail.com


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search parser's treatment of URLs and emails

2011-02-01 Thread Bruce Momjian

I have added this as a TODO:

* Improve handling of plus signs in email address user
  names, and perhaps improve URL parsing

* http://archives.postgresql.org/pgsql-hackers/2010-10/msg00772.php 

---

Thom Brown wrote:
 Hi,
 
 I noticed that if I run this:
 
 SELECT alias, description, token FROM
 ts_debug('http://www.postgresql.org:2345/directory/page.html?version=9.1&build=alpha1#summary');
 
 I get:
 
    alias   |  description  |  token
  ----------+---------------+---------------------------------------------------------------------------------
  protocol | Protocol head | http://
  url      | URL           | www.postgresql.org:2345/directory/page.html?version=9.1&build=alpha1#summary
  host     | Host          | www.postgresql.org:2345
  url_path | URL path      | /directory/page.html?version=9.1&build=alpha1#summary
 (4 rows)
 
 
 It could be me being picky, but I don't regard parameters or page
 fragments as part of the URL path.  Ideally, I'd sort of expect:
 
     alias     |  description  |  token
  --------------+---------------+---------------------------------------------------------------------------------
  protocol     | Protocol head | http://
  url          | URL           | www.postgresql.org:2345/directory/page.html?version=9.1&build=alpha1#summary
  host         | Host          | www.postgresql.org
  port         | Port          | 2345
  url_path     | URL path      | /directory/page.html
  query_string | Query string  | version=9.1&build=alpha1
  fragment     | Page fragment | summary
 (7 rows)
 
 ... of course that's if there was support for query strings and page
 fragments, which there isn't.  But if changes were made to support my
 definition of a URL path, they'd have to be considered breaking
 changes.
 
 But my main gripe is with the name url_path.
 
 Also:
 
 SELECT alias, description, token FROM ts_debug('myname+prior...@gmail.com');
 
 Yields:
 
    alias    |   description   |        token
  -----------+-----------------+---------------------
  asciiword | Word, all ASCII | myname
  blank     | Space symbols   | +
  email     | Email address   | prior...@gmail.com
 (3 rows)
 
 The entire string I entered is a valid email address, and isn't
 totally uncommon.  Shouldn't such email address styles be taken into
 account?  The example above incorrectly identifies the email address,
 since the real destination address would most likely be
 myn...@gmail.com.
 
 -- 
 Thom Brown
 Twitter: @darkixion
 IRC (freenode): dark_ixion
 Registered Linux user: #516935
 
 -- 
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Book recommendation?

2011-02-01 Thread Greg Smith

Herouth Maoz wrote:

My sysadmin ran into the following book:

PostgreSQL 9.0 High Performance, by Gregory Smith, ISBN 184951030X
http://amzn.com/184951030X
  


That guy's a troublemaker, but I guess he writes OK.

There are three customer reviews at 
http://www.amazon.com/PostgreSQL-High-Performance-Gregory-Smith/product-reviews/184951030X/


and two longer ones at:

http://people.planetpostgresql.org/andrew/index.php?/archives/130-Buy-this-book,-now..html
http://www.postgresonline.com/journal/archives/192-postgresql9highperformance.html

And here's some free samples:

https://www.packtpub.com/article/postgresql-9-reliable-controller-disk-setup
https://www.packtpub.com/article/postgresql-9-balancing-hardware-spending
https://www.packtpub.com/article/server-configuration-tuning-postgresql
https://www.packtpub.com/article/unix-monitoring-tool-for-postgresql
https://www.packtpub.com/article/postgresql-tips-tricks
https://www.packtpub.com/sites/default/files/0301OS-Chapter-2-Database-Hardware.pdf

--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] yum repo problem

2011-02-01 Thread John R Pierce

On 02/01/11 12:07 PM, John R Pierce wrote:

warning: rpmts_HdrFromFdno: V3 DSA signature: NOKEY, key ID e8e345b8
Public key for postgresql-8.4.4-2PGDG.el4.i386.rpm is not installed
Retrieving GPG key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
GPG key at file:///etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG (0x442DF0F8) is 
already installed



...


k, I did a yum clean all and it's working now.

odd.  before, it was trying to install...
postgresql-server   i386   8.4.4-2PGDG.el4  pgdg84

and now it's installed...
postgresql-server   i386   8.4.6-1PGDG.rhel4  pgdg84


not sure I understand, but not sure it matters; problem solved.
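
For the archives, the whole fix boiled down to clearing yum's cached metadata
and re-running the update:

# yum clean all
# yum update postgresql-server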



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Issues with generate_series using integer boundaries

2011-02-01 Thread Thom Brown
On 1 February 2011 01:05, Tom Lane t...@sss.pgh.pa.us wrote:
 Thom Brown t...@linux.com writes:
 I've noticed that if I try to use generate_series to include the upper
 boundary of int4, it never returns:

 I'll bet it's testing currval  bound without considering the
 possibility that incrementing currval caused an overflow wraparound.
 We fixed a similar problem years ago in plpgsql FOR-loops...

Yes, you're right.  Internally, the current value is checked against
the finish.  If it hasn't yet passed it, the current value is
increased by the step.  When it reaches the upper bound, since it
hasn't yet exceeded the finish, it proceeds to increment it again,
resulting in the iterator wrapping past the upper bound to become the
lower bound.  This then keeps it looping from the lower bound upward,
so the current value stays well below the end.
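
For reference, on an affected build something like this reproduces the report
(careful: it spins forever, so don't run it anywhere you care about):

SELECT count(*) FROM generate_series(2147483643, 2147483647);
-- after emitting 2147483647 the internal counter wraps around to
-- -2147483648, which is still below the finish value, so the function
-- keeps producing rows and the query never completes.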

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Some Problems - Shall I reinstall the DB?

2011-02-01 Thread Adrian Klaver
On Tuesday, February 01, 2011 12:03:20 pm vwu98...@lycos.com wrote:

 When I try to access PG from the PC command prompt with the psql command, I get
 the following error message:
 
 psql: server closed the connection unexpectedly
    This probably means the server terminated abnormally before or while
 processing the request.
 
 This time an error message shows up in the PG log file: FATAL:  could not
 reattach to shared memory (key=5432001, addr=011D): Invalid argument
 
 As far as I can tell, the PG log file doesn't tell the whole story. The version
 of PG is 8.0.
 


I am going to go out on a limb here and say you are running the server on 
Windows, correct? Before we go any further: the 8.0.x series is past end of life, 
especially the Windows version. The problem you report above (FATAL: ...) is a 
known problem with the Windows version. It was fixed in later releases, looks like 
from 8.2.18 on. I do not run Postgres on Windows, if that indeed is where your 
server is running, so I am at the limit of what I can help you with.

-- 
Adrian Klaver
adrian.kla...@gmail.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cast problem in Postgresql 9.0.1

2011-02-01 Thread Alban Hertroys
On 1 Feb 2011, at 7:14, AI Rumman wrote:

 I migrated my DB from Postgresql 8.1 to Postgresql 9.0.1.

That's quite a big step up. You skipped 8.2, 8.3 and 8.4 - all major releases. 
My advice: test very thoroughly for more differences in behaviour.

One thing to start looking at right away is whether your config parameters 
still make sense, in case you're re-using your old config. Some changed names, 
I think, and there are some new ones you might want to change.

 I have a table testtab
 \d testtab
 id int,
 hours varchar

That seems an odd choice for a datatype. What are you trying to accomplish by 
making it varchar?

And no, of course you can't sum varchars, what kind of output would you expect 
from that? That an older version of Postgres didn't throw an error was probably 
a bug.

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Some Problems - Shall I reinstall the DB?

2011-02-01 Thread Tom Lane
vwu98...@lycos.com writes:
 When I try to access PG from the PC command prompt with the psql command, I get 
 the following error message:

 psql: server closed the connection unexpectedly 
    This probably means the server terminated abnormally before or while 
 processing the request.

 This time an error message shows up in the PG log file: FATAL:  could not 
 reattach to shared memory (key=5432001, addr=011D): Invalid argument

If you're running PG on Windows, this is a known problem that was fixed
some time ago ...

 As far as I can tell, the PG log file doesn't tell the whole story. The version 
 of PG is 8.0.

... but not *that* long ago.  8.0.x on Windows was desupported more than
three years ago, and as of last year is out of support on all platforms.
Please update to a current version.

http://www.postgresql.org/about/news.865

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Problem with encode () and hmac() in pgcrypto

2011-02-01 Thread Marko Kreen
On Tue, Feb 1, 2011 at 5:36 PM, hlcborg h.luis.card...@gmail.com wrote:
 encode(hmac(v_em_crt_conc, v_Private,'sha1'),'base64');

HMAC - key-dependent SHA1

 The Result:

 h6CpmrP1QCE/Mp3xn3utUEPtftg=      This hash has 28 chars

 When I use OpenSSL in command line like this:

 ~$ echo "2011-01-31;2011-02-01T13:33:38;100036;684.40;" | openssl dgst
 -sha1 -sign abc.pem | openssl enc -base64 -A

Plain SHA1, which is signed with RSA signature.

 The Result is:

 nKfxnt31+kk/RnKihJ0jKufq+nZvmPjVauGo8+tqJ1Y/ah/mAu4jSS1wnzU+wRygZ4CLIV9DGSs9bxBc4r9e71C8s9B5ms6Kpggmc12kdmqVHBRO28bPWb/YLCej59gZFFkvcCudweNAT4qHvVqWsOtFCf9kE4q92UIv1JcwSDU=

 This hash has 172 chars

 Does someone know where my problem is? Is there another way to implement
 this, or is it simply impossible?

These two operations are not equivalent.
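
To make the difference concrete, the command-line counterpart of the pgcrypto
call above would be an HMAC, roughly like this (the key is a placeholder, and
note that echo without -n appends a newline, so the input bytes have to match
exactly what is passed to hmac()):

~$ echo -n '2011-01-31;2011-02-01T13:33:38;100036;684.40;' | \
     openssl dgst -sha1 -hmac 'your-hmac-key' -binary | openssl enc -base64 -A

What your openssl command produces instead is an RSA signature over the
digest, which is a different (and much longer) value.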

-- 
marko

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cast problem in Postgresql 9.0.1

2011-02-01 Thread Joshua D. Drake
On Tue, 2011-02-01 at 12:14 +0600, AI Rumman wrote:
 I migrated my DB from Postgresql 8.1 to Postgresql 9.0.1.
 
 I have a table testtab
 \d testtab
 id int,
 hours varchar
 
 When I execute the following:
 select sum(hours) from testtab
 I get cast error.
 
In 8.3, implicit casts were removed. You can't sum text. You need to
change the data type to a proper numerical type.
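
If the column really should hold numbers, a sketch of the fix (this assumes
every existing value in hours parses as a number):

-- convert the column in place:
ALTER TABLE testtab ALTER COLUMN hours TYPE numeric USING hours::numeric;

-- or leave the column alone and cast explicitly in the query:
SELECT sum(hours::numeric) FROM testtab;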

Joshua D. Drake

 
-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] troubleshooting Postgresql on Windows

2011-02-01 Thread Heine Ferreira
Hi

I recently read some of your windows docs on the wiki.
They mentioned that some people experience problems with Postgresql and
firewalls and/or anti-virus programs on windows.
I know that a software firewall is usually not a problem. I have used a
couple of different software firewalls at different times
and they usually prompt you to ask if you want to allow this program to
access the internet.
Hardware firewalls are a different story however. That's why I rather
disabled my adsl router's firewall and I am using a software
firewall. As far as anti virus programs go according to the wiki you can
disable scanning of certain folders or executable files.
What sort of problems do firewalls and anti virus programs cause?
Do they cause the installation of Postgres to crash?
How will you know if a problem is caused by a firewall and/or anti virus
program?
Will Postgres work as an embedded database on windows?
Won't I have lots of support calls with users not being able to install
Postgres on Windows?
Is it possible to write my own installation routine for Postgres under
windows?
Where would I find such info?
I think I did read somewhere on the web that the one click installer does
allow you to do an unattended installation?
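If I understood correctly, an unattended run would look something like the
following - the exact flags are my guess and would need to be checked against
the installer docs:

postgresql-9.0.3-1-windows.exe --mode unattended --superpassword mypassword --serverport 5432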

Thanks

H.F.


Re: [GENERAL] Book recommendation?

2011-02-01 Thread Alban Hertroys
On 1 Feb 2011, at 21:15, Greg Smith wrote:

 Herouth Maoz wrote:
 My sysadmin ran into the following book:
 
 PostgreSQL 9.0 High Performance, by Gregory Smith, ISBN 184951030X
 http://amzn.com/184951030X
  
 
 That guy's a troublemaker, but I guess he writes OK.


It must be tough to be reminded of him every time you look into a mirror.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cast problem in Postgresql 9.0.1

2011-02-01 Thread Tom Lane
Joshua D. Drake j...@commandprompt.com writes:
 On Tue, 2011-02-01 at 12:14 +0600, AI Rumman wrote:
 I migrated my DB from Postgresql 8.1 to Postgresql 9.0.1.
 
 I have a table testtab
 \d testtab
 id int,
 hours varchar
 
 When I execute the following:
 select sum(hours) from testtab
 I get cast error.
 
 In 8.3, implicit casts were removed. You can't sum text.

You couldn't do it in previous releases, either.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Issues with generate_series using integer boundaries

2011-02-01 Thread Alban Hertroys
On 1 Feb 2011, at 21:26, Thom Brown wrote:

 On 1 February 2011 01:05, Tom Lane t...@sss.pgh.pa.us wrote:
 Thom Brown t...@linux.com writes:
 I've noticed that if I try to use generate_series to include the upper
 boundary of int4, it never returns:
 
 I'll bet it's testing currval  bound without considering the
 possibility that incrementing currval caused an overflow wraparound.
 We fixed a similar problem years ago in plpgsql FOR-loops...
 
 Yes, you're right.  Internally, the current value is checked against
 the finish.  If it hasn't yet passed it, the current value is
 increased by the step.  When it reaches the upper bound, since it
 hasn't yet exceeded the finish, it proceeds to increment it again,
 resulting in the iterator wrapping past the upper bound to become the
 lower bound.  This then keeps it looping from the lower bound upward,
 so the current value stays well below the end.


That could actually be used as a feature to create a repeating series. A bit 
more control would be useful though :P

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] zero_damaged_pages doesn't work

2011-02-01 Thread Bruce Momjian
David Boreham wrote:
   On 9/27/2010 4:53 PM, Tom Lane wrote:
  The reason it tells you that data will be destroyed is that that could
  very well happen.
 
 Re-parsing this, I think there was a mis-communication :
 
 I'm not at all suggesting that the doc should _not_ say that data will 
 be corrupted.
 I'm suggesting that in addition to what it currently says, it also 
 should say that the on-disk data won't be
 changed by the page zeroing mode.
 
 In my searching I found countless people over the past few years who had
 been similarly confused into believing that it would write back the 
 zeroed page
 to disk.

Based on this discussion from September, I have applied the attached
documentation patch to clarify that pages zeroed by zero_damaged_pages are
not forced to disk, and when to set this parameter off again.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3a0f755..141430c 100644
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*** LOG:  CleanUpLock: deleting: lock(0xb7ac
*** 6059,6073 
 para
  Detection of a damaged page header normally causes
  productnamePostgreSQL/ to report an error, aborting the current
! command.  Setting varnamezero_damaged_pages/ to on causes
! the system to instead report a warning, zero out the damaged page,
! and continue processing.  This behavior emphasiswill destroy data/,
! namely all the rows on the damaged page.  But it allows you to get
  past the error and retrieve rows from any undamaged pages that might
! be present in the table.  So it is useful for recovering data if
  corruption has occurred due to a hardware or software error.  You should
  generally not set this on until you have given up hope of recovering
! data from the damaged pages of a table.  The
  default setting is literaloff/, and it can only be changed
  by a superuser.
 /para
--- 6059,6075 
 para
  Detection of a damaged page header normally causes
  productnamePostgreSQL/ to report an error, aborting the current
! transaction.  Setting varnamezero_damaged_pages/ to on causes
! the system to instead report a warning, zero out the damaged
! page in memory, and continue processing.  This behavior emphasiswill destroy data/,
! namely all the rows on the damaged page.  However, it does allow you to get
  past the error and retrieve rows from any undamaged pages that might
! be present in the table.  It is useful for recovering data if
  corruption has occurred due to a hardware or software error.  You should
  generally not set this on until you have given up hope of recovering
! data from the damaged pages of a table.  Zerod-out pages are not
! forced to disk so it is recommended to recreate the table or
! the index before turning this parameter off again.  The
  default setting is literaloff/, and it can only be changed
  by a superuser.
 /para

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Windows to Linux PostgreSQL Migration

2011-02-01 Thread David Johnston
Hello,

I have just done a multi-part migration and am having difficulty viewing
binary content stored in a bytea field; specifically I stored a PDF in the
existing setup and now receive a File does not begin with '%PDF-' message
when I go to pull it up from the new setup.

The specific migration is as follows:

Existing:  Windows 2000 (32bit); PostgreSQL 8.2

New: Ubuntu 10.10 (64bit); PostgreSQL 9.0

To migrate the data I did the following:

Pg_dump (v 8.3) from the existing Windows (32) Machine into a dump file on
an NTFS file system.

Pg_restore (v 9.0) from a different Windows (64) machine from the dump file
into the running Linux (64) server.

Using the same application software and JRE (Tomcat container) I attempted
to view the files as normal (using Firefox on a Windows Machine) and the
File does not begin with '%PDF-' message appears.

I can think of some possible variations and causes for this but figured
before I go running for the cliff I'd see if anyone can at least point me in
the right direction.

Thank You

David Johnston




Re: [GENERAL] Windows to Linux PostgreSQL Migration

2011-02-01 Thread Andy Colson

On 2/1/2011 3:44 PM, David Johnston wrote:

Hello,

I have just done a multi-part migration and am having difficulty viewing
binary content stored in a “bytea” field; specifically I stored a PDF in
the existing setup and now receive a “File does not begin with ‘%PDF-‘”
message when I go to pull it up from the new setup.

The specific migration is as follows:

Existing: Windows 2000 (32bit); PostgreSQL 8.2

New: Ubuntu 10.10 (64bit); PostgreSQL 9.0

To migrate the data I did the following:

Pg_dump (v 8.3) from the existing Windows (32) Machine into a dump file
on an NTFS file system.

Pg_restore (v 9.0) from a different Windows (64) machine from the dump
file into the running Linux (64) server.

Using the same application software and JRE (Tomcat container) I
attempted to view the files as normal (using Firefox on a Windows
Machine) and the “File does not begin with ‘%PDF-‘” message appears.

I can think of some possible variations and causes for this but figured
before I go running for the cliff I’d see if anyone can at least point
me in the right direction.

Thank You

David Johnston



Not sure if it'll help, but you could try the pg_dump from version 9 to 
do the backup.


-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Windows to Linux PostgreSQL Migration

2011-02-01 Thread David Johnston
Andy, thank you for the quick response, but I figured out the problem.  I
recalled reading something regarding the bytea data type in the 9.0
migration/release notes and decided to look into that.  It turns out that
the change of the default output format from escape to hex is the cause of
the issue.  I restarted the server with bytea_output = escape and I am once
again able to view my PDF files.

The notes are very vague as to WHY this change was made, but at least getting
the original behavior back isn't that difficult.  I am just curious whether (or
even how) I would change my Java code (JDBC) to use the new default hex
format that appears to be preferred over the escape format.
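
For reference, the relevant setting looks like this (mydb below is a
placeholder; reportedly a recent enough JDBC driver decodes the hex format by
itself, though I have not checked which versions):

# postgresql.conf - restore the pre-9.0 behaviour server-wide
bytea_output = 'escape'

-- or scope it to a single database instead of the whole server:
ALTER DATABASE mydb SET bytea_output = 'escape';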

Dave

-Original Message-
From: Andy Colson [mailto:a...@squeakycode.net] 
Sent: Tuesday, February 01, 2011 4:59 PM
To: David Johnston
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Windows to Linux PostgreSQL Migration

On 2/1/2011 3:44 PM, David Johnston wrote:
 Hello,

 I have just done a multi-part migration and am having difficulty 
 viewing binary content stored in a bytea field; specifically I 
 stored a PDF in the existing setup and now receive a File does not begin
with '%PDF-'
 message when I go to pull it up from the new setup.

 The specific migration is as follows:

 Existing: Windows 2000 (32bit); PostgreSQL 8.2

 New: Ubuntu 10.10 (64bit); PostgreSQL 9.0

 To migrate the data I did the following:

 Pg_dump (v 8.3) from the existing Windows (32) Machine into a dump 
 file on an NTFS file system.

 Pg_restore (v 9.0) from a different Windows (64) machine from the dump 
 file into the running Linux (64) server.

 Using the same application software and JRE (Tomcat container) I 
 attempted to view the files as normal (using Firefox on a Windows
 Machine) and the File does not begin with '%PDF-' message appears.

 I can think of some possible variations and causes for this but 
 figured before I go running for the cliff I'd see if anyone can at 
 least point me in the right direction.

 Thank You

 David Johnston


Not sure if it'll help, but you could try the pg_dump from version 9 to do
the backup.

-Andy


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Documentation enhancement

2011-02-01 Thread Bruce Momjian
Thomas Kellerer wrote:
 Hi,
 
 I would like to suggest to enhance the documentation of the CREATE VIEW
 statement.
 
 I think the fact that a SELECT * is internally stored as the expanded
 column list (valid at the time when the view was created) should be
 documented together with the CREATE VIEW statement. Especially because
 the example does use SELECT * to create the view.

Agreed.  The attached, applied patch documents this behavior.
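
A quick illustration of the behavior being documented (throwaway names):

CREATE TABLE films (id int, title text, kind text);
CREATE VIEW comedies AS SELECT * FROM films WHERE kind = 'Comedy';
ALTER TABLE films ADD COLUMN rating int;
-- "comedies" still exposes only id, title and kind; the new rating column
-- does not appear in the view unless the view is dropped and recreated.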

--
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/ref/create_view.sgml b/doc/src/sgml/ref/create_view.sgml
index dd15507..417f8c3 100644
*** a/doc/src/sgml/ref/create_view.sgml
--- b/doc/src/sgml/ref/create_view.sgml
*** CREATE VIEW comedies AS
*** 173,178 
--- 173,182 
  FROM films
  WHERE kind = 'Comedy';
  /programlisting
+This will create a view containing the columns that are in the
+literalfilm/ table at the time of view creation.  Though
+literal*/ was used to create the view, columns added later to
+the table will not be part of the view.
/para
   /refsect1
  

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Issues with generate_series using integer boundaries

2011-02-01 Thread Thom Brown
On 1 February 2011 21:32, Alban Hertroys
dal...@solfertje.student.utwente.nl wrote:
 On 1 Feb 2011, at 21:26, Thom Brown wrote:

 On 1 February 2011 01:05, Tom Lane t...@sss.pgh.pa.us wrote:
 Thom Brown t...@linux.com writes:
 I've noticed that if I try to use generate_series to include the upper
 boundary of int4, it never returns:

 I'll bet it's testing currval  bound without considering the
 possibility that incrementing currval caused an overflow wraparound.
 We fixed a similar problem years ago in plpgsql FOR-loops...

 Yes, you're right.  Internally, the current value is checked against
 the finish.  If it hasn't yet passed it, the current value is
 increased by the step.  When it reaches the upper bound, since it
 hasn't yet exceeded the finish, it proceeds to increment it again,
 resulting in the iterator wrapping past the upper bound to become the
 lower bound.  This then keeps it looping from the lower bound upward,
 so the current value stays well below the end.


 That could actually be used as a feature to create a repeating series. A bit 
 more control would be useful though :P

I don't quite understand why the code works.  As I see it, it always
returns a set with values 1 higher than the corresponding result.  So
requesting 1 to 5 actually returns 2 to 6 internally, but somehow it
correctly shows 1 to 5 in the query output.  If there were no such
discrepancy, the upper-bound/lower-bound problem wouldn't exist, so
not sure how those output values result in the correct query result
values.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Problem with encode () and hmac() in pgcrypto

2011-02-01 Thread hlcborg

Hi Marko,

OK, 

 These two operations are not equivalent. 

But...
Can I have this operation done in a stored procedure inside the database?
Plain SHA1, signed with an RSA signature, and in the end encoded to
base64?

I was looking at the pgcrypto functions, and I haven't found any that I
could use, I think... Maybe I am not looking for the right one.
I need plain SHA1 signed with an RSA signature and encoded into base64 
due to a new law related to digital signatures in invoices in my country.

--
Luis


-- 
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Problem-with-encode-and-hmac-in-pgcrypto-tp3366420p3367159.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] yum repo problem

2011-02-01 Thread Devrim GÜNDÜZ
On Tue, 2011-02-01 at 12:25 -0800, John R Pierce wrote:
 k, I did a yum clean all and it's working now.
 
 odd.  before, it was trying to install...
 postgresql-server   i386   8.4.4-2PGDG.el4  pgdg84
 
 and now it's installed...
 postgresql-server   i386   8.4.6-1PGDG.rhel4  pgdg84
 
 
 not sure I understand, but not sure it matters; problem solved.

I have seen similar issues with old versions of yum, like you are using
now :(

Glad that it is solved now.
-- 
Devrim GÜNDÜZ
EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz




Re: [GENERAL] Select for update with offset interferes with concurrent transactions

2011-02-01 Thread David Johnston
If random sampling is desirable, would the following construct limit locking
only to the sampled rows?

SELECT id 
FROM tasktable
WHERE id IN (SELECT random_id_sample())
FOR UPDATE

The random_id_sample() would supply a configurable group of IDs off of
tasktable, which the FOR UPDATE would then lock.

I guess the issue remains that random_id_sample() would still end up
blocking if any of the rows it wants to return are already locked.

I too am using this basic protocol of maintaining state info within the
database and sending every query against it.  As I ponder this more it
really seems as if moving some of this logic into the application layer
would possibly make more sense in Yngve's situation (or at least something
to consider).  Continue to use the database as a persistence mechanism but
code the dispatching of tasks in the application layer and then as each
task is dispatched you simply do an UPDATE table SET state = 'dispatch'
WHERE id = 'ID' and a similar UPDATE when the task is returned completed.
This somewhat presumes you still only ever hand off one task at a time.  If
you are indeed handing off tasks in batches then it would make sense to have
a batch table and operate at the batch level instead of individual tasks -
assigning tasks to a given batch via some standard mechanism.  

Either way, if you truly want parallel processing then you need to create
parallel paths that can operate without clobbering each other, and thus each
path needs its own pool of tasks, since as soon as you have a shared resource
the only true way to make sure it is allocated only once is to serialize
access to it.  An alternative method would be to allow multiple dispatches
but have a write-once method that is called and sets an immutable handler_id;
when the processing begins, only the handler with the matching id would be
allowed to perform the actual processing.
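
As a sketch of that write-once idea (handler_id and the literal values are
invented for illustration; queue, record_id and state come from the earlier
query):

UPDATE queue
   SET handler_id = 'worker-42', state = 'dispatched'
 WHERE record_id = 1234
   AND handler_id IS NULL;
-- zero rows updated means another worker already claimed the task; only the
-- worker whose id actually stuck goes on to do the processing.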

I say the above with certainty, but at the moment I am using, and am fairly
happy with, my limited serialization - especially since I have specific
sub-properties that I can use to limit how many records are locked, AND also
because the locking time is very short (I cap around 20 or so active tasks
to dispatch - and only infrequently at that) - so my experience and insight
into high-demand situations is limited.

Dave


-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Tuesday, February 01, 2011 12:18 PM
To: Yngve Nysaeter Pettersen
Cc: pgsql-general@postgresql.org
Subject: Re: Select for update with offset interferes with concurrent
transactions 

Yngve Nysaeter Pettersen yn...@opera.com writes:
 To avoid having the processes trample each other's queries (the first 
 attempt was to select the first matching entries of the table, which 
 caused one to block all other transactions), one of the steps I took 
 was to select a set of idle rows at a random offset into the table 
 from the project, mark them for update, then update each record's state as
started.

SELECT record_id FROM queue WHERE project_id = my_project AND state 
 = idle LIMIT n OFFSET i FOR UPDATE

 At present n is 100-150, i is a random value in the range 0-1.

 There is, intentionally, no ordering specified, since that would just 
 slow down the query, and is not necessary.

This seems like a pretty bad design.  There are recognized ways to solve
this problem with more predictability and much less chance of different
processes blocking each other.  In particular, this query seems to be based on
some untenable assumptions about the physical row order being stable.

 What I've discovered when using Postgres 9.0 is that the processes are 
 now blocking every other query into this table,

In 9.0, LIMIT/OFFSET processing is done after FOR UPDATE locking, which
means that rows skipped over by OFFSET still get locked, which means that
different sessions executing this query are now practically certain to block
each other, rather than just likely to block each other.
This was an intentional change to improve the predictability of FOR UPDATE's
interactions with LIMIT/OFFSET, and indeed it's improved the predictability
of the behavior for you, just not in the direction you'd like :-(

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Importing/Appending to Existing Table

2011-02-01 Thread Rich Shepard

  I have an existing table with 15,492 rows and want to add additional rows
from a .csv file. If I use 'COPY tablename from filename with delimiter
as : csv quote as ' ' will this overwrite existing rows in the table or
append rows?

Rich

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why does my DB size differ between Production and DR? (Postgres 8.4)

2011-02-01 Thread Aleksey Tsalolikhin
On Mon, Jan 31, 2011 at 7:52 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Mon, Jan 31, 2011 at 5:54 PM, Aleksey Tsalolikhin
 atsaloli.t...@gmail.com wrote:
 Situation:  Disk usage on production server root filesystem is at 68%
 utilization (80 GB used), on DR is at 51% (56 GB used).   We use
 SlonyII-1.2.x to keep the DR up to date.  I would like to account for
 the 24 GB difference.

 This is likely free space in your database.  Some of it is completely
 normal and actually improves performance.  Too much and your db is
 bloated and things starting taking too long.

Thanks, Scott!

Bucardo's check_postgres.pl --action bloat complains about one table,
1 GB wasted.  So the other tables must be OK.

So what about my DR, which doesn't have this same 20+ GB of free space.
Will it acquire it once it goes into production?  Will performance be impacted
as it acquires the free space?  Should I even be concerned about the
difference in disk usage or is it normal and expected?

How do I find out how much actual data I have in my database, minus the
free space?  Is there some built-in way to report this, or do I need to run
SELECT * FROM * and look at the byte count of the output?
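
I imagine something like the size functions, which report on-disk size
including the free space, or the pgstattuple contrib module, which splits a
table into live data, dead rows and free space, is what I'm after (the table
name below is just a placeholder):

SELECT pg_size_pretty(pg_database_size(current_database()));
SELECT pg_size_pretty(pg_total_relation_size('some_table'));
-- with contrib/pgstattuple installed:
-- SELECT * FROM pgstattuple('some_table');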

Thanks,
-at

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why does my DB size differ between Production and DR? (Postgres 8.4)

2011-02-01 Thread Peter Geoghegan
On 1 February 2011 03:52, Scott Marlowe scott.marl...@gmail.com wrote:
 You can reclaim that space by doing a cluster or vacuum full on the
 subject table.

Yes, but this is a fairly bad idea, particularly prior to PG 9.0.  9.0
has a new VACUUM FULL implementation that makes it not so bad - it
just rewrites the entire table.

VACUUM FULL will take exclusive locks on tables being vacuumed. It
also causes index bloat. You should be very careful about using it on
a production system.

I'm not sure why you'd advocate CLUSTER as a way to reclaim disk space.

I wouldn't increase index fill factor as an optimisation, unless you
had the unusual situation of having very static data in the table.

-- 
Regards,
Peter Geoghegan

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why does my DB size differ between Production and DR? (Postgres 8.4)

2011-02-01 Thread Scott Marlowe
On Tue, Feb 1, 2011 at 8:13 PM, Peter Geoghegan
peter.geoghega...@gmail.com wrote:
 On 1 February 2011 03:52, Scott Marlowe scott.marl...@gmail.com wrote:
 You can reclaim that space by doing a cluster or vacuum full on the
 subject table.

 Yes, but this is a fairly bad idea, particularly prior to PG 9.0 . 9.0
 has a new vacuum full implementation that makes it not so bad - it
 just rewrites the entire table.

 VACUUM FULL will take exclusive locks on tables being vacuumed. It
 also causes index bloat. You should be very careful about using it on
 a production system.

I know these things.  I'm pretty sure it's even in the docs by now.

 I'm not sure why you'd advocate CLUSTER as a way to reclaim disk space.

Because it can reclaim disk space?

http://www.postgresql.org/docs/8.4/static/sql-cluster.html

 I wouldn't increase index fill factor as an optimisation, unless you
 had the unusual situation of having very static data in the table.

That makes no sense whatsoever.  You decrease fill factor (not
increase btw) so there will be some space for future updates.  If he's
getting bloat it may well help quite a bit to have a lower than 100%
fill factor.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why does my DB size differ between Production and DR? (Postgres 8.4)

2011-02-01 Thread Scott Marlowe
More info here.

http://www.postgresql.org/docs/8.4/static/routine-vacuuming.html#VACUUM-FOR-SPACE-RECOVERY

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why does my DB size differ between Production and DR? (Postgres 8.4)

2011-02-01 Thread Scott Marlowe
On Tue, Feb 1, 2011 at 7:29 PM, Aleksey Tsalolikhin
atsaloli.t...@gmail.com wrote:
 Bucardo's check_postgres.pl --action bloat complains about one table,
 1 GB wasted.  So the other tables must be OK.

Pretty sure that unless you give it more args, the default for bloat
check is to list the first bloated table and stop.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] postgresql 9 on windows 2000

2011-02-01 Thread geek.shrek
Hi,

Does postgresql run on windows 2000?

I've tried to install postgresql-9.0.3-1 on windows 2000 but I couldn't connect 
to the default local server.

If not, which version can I use for windows 2000?


Thanks,


  

[GENERAL] Tablespaces Issue in Postgres

2011-02-01 Thread Adarsh Sharma

Dear all,

A few days ago I attached a tablespace to one of my databases. Say 
/hdd-1/database_name is the path for the tablespace. When I attached it, 
it had 100 GB of free space.


Now my database size has grown to 90 GB, and I want to prepare for the 
space problem that is coming.

How can I handle this? Would anyone please guide me on the way to overcome it?
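
For example, would adding a second tablespace and moving things over be the
right approach? Something like the following - the path and names here are
only examples:

CREATE TABLESPACE hdd2_space LOCATION '/hdd-2/database_name';
ALTER TABLE some_big_table SET TABLESPACE hdd2_space;
-- optionally make it the default for new objects in this database:
ALTER DATABASE mydb SET default_tablespace = 'hdd2_space';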

Highly Thankful
Adarsh Sharma

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Importing/Appending to Existing Table

2011-02-01 Thread Sergey Konoplev
On 2 February 2011 03:28, Rich Shepard rshep...@appl-ecosys.com wrote:
  I have an existing table with 15,492 rows and want to add additional rows
 from a .csv file. If I use 'COPY tablename from filename with delimiter
 as : csv quote as ' ' will this overwrite existing rows in the table or
 append rows?

No, it won't overwrite; it will append rows.
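
For example (the file path here is made up; the options mirror the syntax in
your message):

COPY tablename FROM '/path/to/more_rows.csv' WITH DELIMITER AS ':' CSV QUOTE AS '"';
-- COPY only ever appends; if some of the incoming rows duplicate existing
-- ones you would have to handle that separately, e.g. by loading into a
-- staging table first.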


 Rich

 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general




-- 
Sergey Konoplev

Blog: http://gray-hemp.blogspot.com /
Linkedin: http://ru.linkedin.com/in/grayhemp /
JID/GTalk: gray...@gmail.com / Skype: gray-hemp

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why does my DB size differ between Production and DR? (Postgres 8.4)

2011-02-01 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


 Bucardo's check_postgres.pl --action bloat complains about one table,
 1 GB wasted. So the other tables must be OK.

 Pretty sure that unless you give it more args, the default for bloat
 check is to list the first bloated table and stop.

No, it will show all tables over the given threshold. However, the 
statement "the other tables must be OK" is definitely not a given, 
as the bloat calculation used by check_postgres is a very rough one.

- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201102020206
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAk1JAqkACgkQvJuQZxSWSsiH0ACfZowR8lU2PJByBCyhsELpdozg
3SIAnjguAyRbjXxT8cSo6yZ8zar00TNZ
=ji8B
-END PGP SIGNATURE-



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general