Re: [GENERAL] PostgreSQL 8.4.8 bringing my website down every evening

2011-05-27 Thread Albe Laurenz
Scott Marlowe wrote:
 Then just use pid or something that can uniquely identify the queries
 when they're running.

I recommend %c in log_line_prefix.
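
For example, a postgresql.conf setting along these lines stamps every log line with a timestamp, the backend PID and the session ID (the combination beyond %c is just an illustration):

log_line_prefix = '%t [%p] %c '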

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] max_connections proposal

2011-05-27 Thread Cédric Villemain
2011/5/27 Tom Lane t...@sss.pgh.pa.us:
 Craig Ringer cr...@postnewspapers.com.au writes:
 On 05/26/2011 09:48 PM, Tom Lane wrote:
 Craig Ringer cr...@postnewspapers.com.au writes:
 max_connections = 100                   # (change requires restart)
 # WARNING: If you're about to increase max_connections above 100, you
 # should probably be using a connection pool instead. See:
 #     http://wiki.postgresql.org/max_connections

 This gives the impression that performance is great at 100 and falls off
 a cliff at 101, which is both incorrect and likely to lower people's
 opinion of the software.

 Fair call; the use of a specific value is misleading.

 I'd suggest wording more like "if you're
 considering raising max_connections into the thousands, you should
 probably use a connection pool instead".

 Best performance is often obtained with the number of _active_
 connections in the 10s to 30s on commonplace hardware. I'd want to use
 hundreds - because mailing list posts etc suggest that people start
 running into problems under load at the 400-500 mark, and more
 importantly because it's well worth moving to pooling _way_ before that
 point.

 OK, maybe word it as "If you're considering raising max_connections much
 above 100, ..." ?

"Be aware that a too large value can be counter-productive and a
connection pooler can be more appropriate."

No scale... I am really happy to face more and more servers where
'top' truncates the list of processors... We will have to scale, and
not make that limitation a feature, imho.

-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] max_connections proposal

2011-05-27 Thread Stuart Bishop
On Fri, May 27, 2011 at 6:22 AM, Craig Ringer
cr...@postnewspapers.com.au wrote:

 Best performance is often obtained with the number of _active_ connections
 in the 10s to 30s on commonplace hardware. I'd want to use hundreds -
 because mailing list posts etc suggest that people start running into
 problems under load at the 400-500 mark, and more importantly because it's
 well worth moving to pooling _way_ before that point.

If you can. I'd love a connection pool that knows when I have a
resource that persists across transactions, like a cursor or temporary
table, so the backend connection needs to be maintained between
transactions, and when there are no such resources and the backend
connection can be released to the pool between transactions. I suspect
this sort of pool would need to be built into core. At the moment I
only see a benefit from a pool for connections from my webapp, which
I know can safely go through pgbouncer in transaction pooling mode.

Or would there be some way of detecting if the current session has
access to stuff that persists across transactions and this feature
could be added to the existing connection pools?
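
(For reference, the pgbouncer side of that is just a pool_mode setting; a minimal sketch with a made-up database name and a deliberately incomplete auth setup:)

[databases]
webapp = host=127.0.0.1 port=5432 dbname=webapp

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = trust
pool_mode = transaction
; cursors, temp tables, session GUCs etc. held across transactions are
; exactly what this mode cannot preserve, hence the question above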


-- 
Stuart Bishop stu...@stuartbishop.net
http://www.stuartbishop.net/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Feature request: Replicate only parts of a database

2011-05-27 Thread Peter Geoghegan
On 27 May 2011 05:53, A B gentosa...@gmail.com wrote:

 I have a small problem: in a database I need to export parts of a
 table to another server. I could easily accomplish that (I hope) by
 creating a view, selecting * from the view and sending it over to the
 other server, or by using triggers to record which rows are inserted
 and deleted. (I have not written that part yet.)

 With the new very nice streaming replication, I think it would be much
 better if the database could send the data directly to the other
 server instead of having to write my own sync-script.
 But I don't want to sync the entire database since only a very small
 fraction of the data should be replicated.

That isn't going to happen, because of the way streaming replication
is implemented. Besides, you haven't even described what interface
such a feature would actually have. How would it know which tuples to
replicate? Is that really a more sensible interface than just being
able to selectively replicate a subset of tables?

Consider a trigger-based replication system like Slony or Londiste,
which allows asynchronous replication at table granularity.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Feature request: Replicate only parts of a database

2011-05-27 Thread Andrew Sullivan
On Fri, May 27, 2011 at 06:53:17AM +0200, A B wrote:

 hope)  that by creating a view and select * from the view and send it
 over to the other server or use triggers to record what rows are
 inserted and deleted. (I have not written that part yet)

You could do this today with Slony or Londiste or any of the other
trigger-based replication systems.  
 
A
-- 
Andrew Sullivan
a...@crankycanuck.ca

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is there any problem with pg_notify and memory consumption?

2011-05-27 Thread Tom Lane
Per-Olov Esgard per-olov.esg...@micronic-mydata.com writes:
 This is my setup:
 - linux kernel 2.6.20.16 in a custom OS installation based on Fedora Core 
 3, yes I know, stone age :-)
 - memory 512 MB
 - swap 512 MB
 - cpu p4 2.8 GHz - single core - no hyper threading
 - db encoding UTF-8
 - client encoding default Latin-1 (Linux client) but UTF-8 used for 
 Windows clients 
 - postgresql.conf is attached as well as the environment variables
 - we  build the postgres server ourselves and the dev env is the same as 
 the target env  (no cross compilation).

Hah, I replicated the problem.  Here is what's going on: the main loop
in PostgresMain is intended to do its work in MessageContext.  But if
ProcessCompletedNotifies does any work, it exits with
CurrentMemoryContext having been reset to TopMemoryContext during
transaction commit.  This means any memory that the main loop doesn't
bother to explicitly free during the next command cycle will get leaked.
The reason we haven't noticed this seems to be that the only case where
any such memory does get leaked is if we have to do encoding conversion
on the incoming command.  Also, the bug has only been there since 9.0.
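
(A rough sketch of a sequence that should exercise that path, assuming a UTF-8 database with a latin1 client as in the reported setup; the channel name is arbitrary:)

-- single session with client_encoding = latin1, repeated in a loop:
LISTEN leak_check;
NOTIFY leak_check;
SELECT 'Esgård';  -- non-ASCII text forces conversion of the incoming command
-- on an unpatched 9.0 backend, that backend's memory use should creep upward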

I think the right fix is to make sure that ProcessCompletedNotifies
saves and restores the call-time CurrentMemoryContext.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] syntax for updating an aliased table

2011-05-27 Thread Andy Chambers
On Thu, May 26, 2011 at 1:40 PM, Rick Genter rick.gen...@gmail.com wrote:

 The UPDATE statement when multiple tables are involved always drives me nuts.
 I think what you need to do is remove all of the "old." prefixes from the SET
 clause, use "triple." in the WHERE clause instead of "old.", and remove the
 "old" table alias from the UPDATE.
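
(For the archives, the corrected statement has roughly this shape -- the table and column names below are illustrative, not the original query:)

UPDATE maintable
   SET col1 = triple.col1,
       col2 = triple.col2
  FROM triple
 WHERE maintable.id = triple.id;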

This worked.  Thanks very much

--
Andy

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] copy record?

2011-05-27 Thread Merlin Moncure
On Thu, May 26, 2011 at 3:23 PM, Gauthier, Dave dave.gauth...@intel.com wrote:
 Hi:



 From within a perl/DBI script, I want to be able to make a copy of a record
 in a table, changing only the value of the primary key.  I don't know in
 advance what all the columns are, just the table name.



 I suppose I could pull the column names for the table from the metadata,
 query the table/record for the values to copy, build an insert statement
 from all of that and run it.  But is there a simpler, more elegant way to do
 this?

there's a very easy way using the composite type method, as long as you
know which field(s) are the primary key -- especially if it's, say, the
first column and an integer.

postgres=# \d foo
      Table "public.foo"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer |
 b      | integer |
Indexes:
    "foo_a_b_idx" btree (a, b)

postgres=# select foo from foo limit 1;
  foo
-------
 (1,1)
(1 row)

change 1 -> 2 textually, cast the text back to the composite, and pass it back in:

insert into foo select (($$(2,1)$$::foo).*);
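
(If you want the textual substitution done in SQL as well, here is a sketch for this particular example only -- a naive replace(), so it is safe only when the old key text is unambiguous within the row's text form:)

insert into foo
select (replace(foo::text, '(1,', '(2,')::foo).*
  from foo
 where a = 1;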

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is there any problem with pg_notify and memory consumption?

2011-05-27 Thread Simon Riggs
On Fri, May 27, 2011 at 4:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Hah, I replicated the problem.  Here is what's going on: the main loop
 in PostgresMain is intended to do its work in MessageContext.  But if
 ProcessCompletedNotifies does any work, it exits with
 CurrentMemoryContext having been reset to TopMemoryContext during
 transaction commit.  This means any memory that the main loop doesn't
 bother to explicitly free during the next command cycle will get leaked.
 The reason we haven't noticed this seems to be that the only case where
 any such memory does get leaked is if we have to do encoding conversion
 on the incoming command.  Also, the bug has only been there since 9.0.

 I think the right fix is to make sure that ProcessCompletedNotifies
 saves and restores the call-time CurrentMemoryContext.

Can we put a WARNING in there if we try to commit while in TopMemoryContext?

That way we'll trap any future leaks in core/add-on code.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is there any problem with pg_notify and memory consumption?

2011-05-27 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On Fri, May 27, 2011 at 4:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I think the right fix is to make sure that ProcessCompletedNotifies
 saves and restores the call-time CurrentMemoryContext.

 Can we put a WARNING in there if we try to commit while in TopMemoryContext?

That has nothing to do with it ...

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] adding applications to the stack builder

2011-05-27 Thread Gaetano Giunta

Hello

I would like to know what the process is to get new applications accepted for
inclusion in the stack builder (namely the eZ Publish CMS).

I would be ready to spend some time packaging the application according to some
specific format, and to host the built package on a dedicated server if there
is a need - but the only thing I've found so far is a project on pgFoundry that
looks like abandonware (not a lot of activity since 2007...)


thanks
Gaetano


Re: [GENERAL] Is there any problem with pg_notify and memory consumption?

2011-05-27 Thread Per-Olov Esgard
Thank you very much for your fast answers (both Tom and Merlin), I really appreciate it.
Do I have to send a proper bug report for this?
We have this problem in our product now, so I volunteer to test a patch :-)

Best regards,
Per-Olov Esgård

-----Tom Lane t...@sss.pgh.pa.us wrote: -----

 To: Per-Olov Esgard per-olov.esg...@micronic-mydata.com
 From: Tom Lane t...@sss.pgh.pa.us
 Date: 05/27/2011 05:28PM
 Cc: Merlin Moncure mmonc...@gmail.com, pgsql-general@postgresql.org
 Subject: Re: [GENERAL] Is there any problem with pg_notify and memory consumption?

 Per-Olov Esgard per-olov.esg...@micronic-mydata.com writes:
  This is my setup:
  - linux kernel 2.6.20.16 in a custom OS installation based on Fedora Core 3, yes I know, stone age :-)
  - memory 512 MB
  - swap 512 MB
  - cpu p4 2.8 GHz - single core - no hyper threading
  - db encoding UTF-8
  - client encoding default Latin-1 (Linux client) but UTF-8 used for Windows clients
  - postgresql.conf is attached as well as the environment variables
  - we build the postgres server ourselves and the dev env is the same as the target env (no cross compilation).

 Hah, I replicated the problem.  Here is what's going on: the main loop
 in PostgresMain is intended to do its work in MessageContext.  But if
 ProcessCompletedNotifies does any work, it exits with
 CurrentMemoryContext having been reset to TopMemoryContext during
 transaction commit.  This means any memory that the main loop doesn't
 bother to explicitly free during the next command cycle will get leaked.
 The reason we haven't noticed this seems to be that the only case where
 any such memory does get leaked is if we have to do encoding conversion
 on the incoming command.  Also, the bug has only been there since 9.0.

 I think the right fix is to make sure that ProcessCompletedNotifies
 saves and restores the call-time CurrentMemoryContext.

 regards, tom lane

Re: [GENERAL] Is there any problem with pg_notify and memory consumption?

2011-05-27 Thread Merlin Moncure
On Fri, May 27, 2011 at 11:12 AM, Per-Olov Esgard
per-olov.esg...@micronic-mydata.com wrote:
 Thank you very much for your fast answers (both Tom and Merlin), I really
 appreciate it.
 Do I have to send a proper bug report for this?
 We have this problem in our product now, so I volunteer to test a patch  :-)

download REL9_0_STABLE, compile, build, install, and test.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is there any problem with pg_notify and memory consumption?

2011-05-27 Thread Tom Lane
I wrote:
 I think the right fix is to make sure that ProcessCompletedNotifies
 saves and restores the call-time CurrentMemoryContext.

The patch committed here appears to fix it for me:
http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=722548e4309c28631ada292fe6cad04ae8f9c151

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] SELECT COUNT(*) execution time on large tables (v9.0.4-1)

2011-05-27 Thread Carlos Sotto Maior (SIM)
Hi,

My application has a frequent need to issue a select count(*) on tables.
Some have a large row count. (The examples below are from a 5.7 M row
table; some are larger.)

Issuing either SELECT COUNT(*) or SELECT COUNT(Primary_Key_Column) yields
a sequential scan on the table.

I have browsed the catalog tables, digging for a real-time row count, but
so far did not find any.
QUESTION: Is there a better (faster) way to obtain the row count from a
table?


The table has been VACUUM ANALYZEd prior to the EXPLAIN ANALYZE runs (numbers
are from a test server).


explain analyze select count(*) as qtd from ut_mailing_client ;
Aggregate  (cost=1231424.23..1231424.24 rows=1 width=0) (actual
time=7205.009..7205.010 rows=1 loops=1)
  ->  Seq Scan on ut_mailing_client  (cost=100.00..1217054.58
rows=5747858 width=0) (actual time=0.034..3866.765 rows=5747311 loops=1)
Total runtime: 7205.071 ms


explain analyze select count(utm_id) as qtd from ut_mailing_client ;
Aggregate  (cost=1231424.23..1231424.24 rows=1 width=4) (actual
time=7984.382..7984.382 rows=1 loops=1)
  ->  Seq Scan on ut_mailing_client  (cost=100.00..1217054.58
rows=5747858 width=4) (actual time=0.028..4114.177 rows=5747311 loops=1)
Total runtime: 7984.443 ms

--
explain analyze select count(beneficio) as qtd from ut_mailing_client ;
Aggregate  (cost=1231424.23..1231424.24 rows=1 width=11) (actual
time=7591.530..7591.530 rows=1 loops=1)
  ->  Seq Scan on ut_mailing_client  (cost=100.00..1217054.58
rows=5747858 width=11) (actual time=0.032..3845.412 rows=5747311 loops=1)
Total runtime: 7591.595 ms

------------------------------- TABLE STRUCTURE -------------------------------
CREATE TABLE ut_mailing_client
(
  utm_id serial NOT NULL,
  beneficio character varying(10) NOT NULL,
.
.
.
  CONSTRAINT ut_mailing_client_pkey PRIMARY KEY (beneficio),
  CONSTRAINT ut_mailing_client_utm_id_key UNIQUE (utm_id)
)
WITH (
  OIDS=FALSE
);

------------------------------- VACUUM ANALYZE -------------------------------
INFO:  vacuuming public.ut_mailing_client
INFO:  index ut_mailing_client_pkey now contains 5747311 row versions in
18926 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 6.50s/3.24u sec elapsed 39.03 sec.
INFO:  index ut_mailing_client_utm_id_key now contains 5747311 row
versions in 12615 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 4.28s/2.19u sec elapsed 26.05 sec.
INFO:  index ut_mailing_client_utm_fk_lote_utm_dt_used_idx now contains
5747311 row versions in 18926 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 6.39s/3.27u sec elapsed 38.90 sec.
INFO:  ut_mailing_client: found 0 removable, 1179 nonremovable row
versions in 31 out of 159576 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 17.17s/8.71u sec elapsed 104.02 sec.
INFO:  vacuuming pg_toast.pg_toast_69799
INFO:  index pg_toast_69799_index now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  pg_toast_69799: found 0 removable, 0 nonremovable row versions in 0
out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing public.ut_mailing_client
INFO:  ut_mailing_client: scanned 3 of 159576 pages, containing
1080857 live rows and 0 dead rows; 3 rows in sample, 5749295 estimated
total rows
Total query runtime: 111560 ms.

Carlos Sotto Maior
+55 11 8244-7899
cso...@sistemassim.com.br

Sistemas Sim Serviços e Tecnologia Ltda.
+55 11 5041-3086
Rua Tenente Gomes Ribeiro, 78
Vila Clementino (Próximo ao Metro Santa Cruz)
São Paulo - SP 
04038-040



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Re: 500KB PDF files for postgresql8.3, which is the most efficient way?

2011-05-27 Thread Emi Lu

Hello,

I'd like to have more comments about the following case:


. 500KB per PDF file; 30 files per year
. PSQL8.3

  . struts2.2.3 + mybatis for sql operation
  . tomcat6

Added more info

 Solution:
 
 (1) Save pdfs to file system, only point file name in psql8.3

 (2) Save oids of pdfs into table

 (3) Save pdf files as bytea column in psql8.3


Pros and cons for (1), (2), (3), which is the most efficient way?

Thanks a lot!
Emi


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] 500KB PDF files for postgresql8.3, which is the most efficient way?

2011-05-27 Thread Emi Lu

Hello,

I'd like to have more comments about the following case:

. 500KB per PDF file; 30 files per year
. PSQL8.3

Solution:

(1) Save pdfs to file system, only point file name in psql8.3

(2) Save oids of pdfs into table

(3) Save pdf files as bytea column in psql8.3

Pros and cons for (1), (2), (3), which is the most efficient way?

Thanks a lot!
Emi





On 05/27/2011 12:45 AM, Jasen Betts wrote:

On 2011-05-26, Bosco Rama postg...@boscorama.com wrote:


   select * into temp table foo from maintable where primcol=123;
   update foo set primcol = 456;
   insert into maintable select * from foo;

You may also need this if you intend to use the same sequence of
calls within the same session:

   drop table foo;


Yet another way to do the same thing:

begin;

create temporary table foo on commit drop as
  select * from maintable where primcol=123;
update foo set primcol=456;
insert into maintable select * from foo;

commit;



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Inspecting a DB - psql or system tables ?

2011-05-27 Thread Andre Majorel
Suppose you want to write a program that connects to a
PostgreSQL database and lists its tables and views, the type of
their columns and their relationships (REFERENCES) for automatic
joins.

Would you look at the system tables (pg_class et al.) or the
output of psql \d, \dt, etc ?

While parsing the output of psql is cumbersome, accessing the
system tables seems more likely to break whenever a new version
of PostgreSQL comes out.

-- 
André Majorel http://www.teaser.fr/~amajorel/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Inspecting a DB - psql or system tables ?

2011-05-27 Thread fork
Andre Majorel aym-2lqsgp at teaser.fr writes:

 
 Suppose you want to write a program that connects to a
 PostgreSQL database and lists its tables and views, the type of
 their columns and their relationships (REFERENCES) for automatic
 joins.

I personally would try to hit the information_schema, which is a bunch of views
over the system metadata that follow a standard. I am not sure if everything you
want is in there - things like locks or other stuff - but for table names etc. it
works well.
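
For example (standard information_schema views; for databases with several schemas you would also match on the constraint_schema columns):

-- tables and views
SELECT table_name, table_type
  FROM information_schema.tables
 WHERE table_schema = 'public';

-- columns and their types
SELECT table_name, column_name, data_type
  FROM information_schema.columns
 WHERE table_schema = 'public'
 ORDER BY table_name, ordinal_position;

-- foreign keys (REFERENCES), useful for building automatic joins
SELECT tc.table_name, kcu.column_name,
       ccu.table_name  AS foreign_table,
       ccu.column_name AS foreign_column
  FROM information_schema.table_constraints tc
  JOIN information_schema.key_column_usage kcu
    ON kcu.constraint_name = tc.constraint_name
  JOIN information_schema.constraint_column_usage ccu
    ON ccu.constraint_name = tc.constraint_name
 WHERE tc.constraint_type = 'FOREIGN KEY';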


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Inspecting a DB - psql or system tables ?

2011-05-27 Thread Tomas Vondra
On 27.5.2011 19:24, Andre Majorel wrote:
 Suppose you want to write a program that connects to a
 PostgreSQL database and lists its tables and views, the type of
 their columns and their relationships (REFERENCES) for automatic
 joins.
 
 Would you look at the system tables (pg_class et al.) or the
 output of psql \d, \dt, etc ?

System tables (old-fashioned pg_ catalogs or information_schema). Psql
reads those catalogs anyway, so parsing the output seems like an
unnecessary step.

 While parsing the output of psql is cumbersome, accessing the
 system tables seems more likely to break whenever a new version
 of PostgreSQL comes out.

Really? Those catalogs are pretty stable, and when changed they're
usually extended (new columns are added). So well written queries won't
break very often. Actually I'd expect the psql output to change much
more often.

regards
Tomas

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Inspecting a DB - psql or system tables ?

2011-05-27 Thread Andrew Sullivan
On Fri, May 27, 2011 at 08:26:33PM +0200, Tomas Vondra wrote:
  While parsing the output of psql is cumbersome, accessing the
  system tables seems more likely to break whenever a new version
  of PostgreSQL comes out.
 
 Really? Those catalogs are pretty stable, and when changed they're
 usually extended (new columns are added). So well written queries won't
 break very often. Actually I'd expect the psql output to change much
 more often.

The whole point of the information_schema is that it's well-defined by
the standard.  The system tables themselves do sometimes change
between versions -- that's why you get warnings from psql when you
start up a client with a different major version number than the
server.  (If you want to see this in action, try using a 7.4-era
client with 9.0, and do some tab completion or something like that.)

A

-- 
Andrew Sullivan
a...@crankycanuck.ca

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] trigger - dynamic WHERE clause

2011-05-27 Thread Tarlika Elisabeth Schmitz
Hello Pavel,
Thanks for taking the time to reply.

On Fri, 27 May 2011 09:12:20 +0200
Pavel Stehule pavel.steh...@gmail.com wrote:

Hello

2011/5/26 Tarlika Elisabeth Schmitz postgres...@numerixtechnology.de:
 On Sun, 22 May 2011 20:39:01 +0200
 Pavel Stehule pavel.steh...@gmail.com wrote:

2011/5/22 Tarlika Elisabeth Schmitz
postgres...@numerixtechnology.de:
 EXECUTE 'SELECT 1 FROM ' || TG_TABLE_NAME || ' WHERE ' ||
 whereclause || ' FOR UPDATE;';

 I am generating the whereclause dynamically as the number of
 columns queried varies.

 Am I right in assuming that I cannot use EXECUTE ... USING in this
 scenario?


why not? You can use it - just USING has a fixed number of
parameters, so you should use arrays.

 Currently, I am producing the whereclause on a subset of columns:

 SELECT  array_to_string (array(
  SELECT  newrecord.key || ' = ' ||  quote_literal(newrecord.value)
  FROM (SELECT (each(hstore(NEW))).*) AS newrecord
 WHERE newrecord.key LIKE 'id%' ), ' AND ')
 INTO whereclause;

 That gives me, for example:
 SELECT 1 FROM test WHERE id1 = '26' AND id2 = 'name2' FOR UPDATE;

 In an attempt to use EXECUTE '...' USING, I tried to execute
 SELECT 1 FROM test WHERE id1 = $1 AND id2 = $2 FOR UPDATE;

 I produced an array of corresponding values:
 [...]

 EXECUTE '...' USING av

 == ERROR: operator does not exist: integer = text[]


I am not sure if I understand your goals well.

I am trying to write a generic INSERT trigger, which checks whether the
NEW record already exists. In the simplified example above, columns
called id* are PK columns and they might be of different type.

The problem shown is an incorrect use of an array
[...]
The USING clause doesn't do array unpacking

you should generate a slightly different dynamic statement
EXECUTE 'SELECT .. WHERE a = $1[1] AND b = $1[2]' USING ARRAY[...]

I changed that but this wasn't my only problem; typecasting was the
second issue. Column id1 is INT4 and the value obtained from NEW via
each(hstore(NEW)) is converted to TEXT.

I can fix this by explicit typecasting:
'... WHERE id1 = $1[1]::int4 ...'
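
Putting that together, the existence check ends up looking roughly like this -- only a sketch, with the two id columns and their types hard-coded from the simplified example rather than derived from hstore as in the real trigger, and the function name made up:

CREATE OR REPLACE FUNCTION check_existing() RETURNS trigger AS $$
DECLARE
  av    text[];
  dummy int;
BEGIN
  av := ARRAY[NEW.id1::text, NEW.id2];
  EXECUTE 'SELECT 1 FROM ' || quote_ident(TG_TABLE_NAME)
       || ' WHERE id1 = $1[1]::int4 AND id2 = $1[2] FOR UPDATE'
     INTO dummy
    USING av;
  IF dummy IS NOT NULL THEN
    RETURN NULL;      -- the row already exists, skip the INSERT
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;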



But there's a few things I'd be interested to understand:

1) My original version quoted all values regardless of type. I presume
this worked with integers because there's some implicit typecasting
going on?

2) I took from your blog entry
(http://okbob.blogspot.com/2008/06/execute-using-feature-in-postgresql-84.html)
that it is good practice to use EXECUTE USING. 
Well, there's no danger of SQL injection as this particular DB runs on
an internal network. However, I am wondering whether EXECUTE USING has
a performance advantage?


-- 

Best Regards,
Tarlika Elisabeth Schmitz

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Re: [SQL] Re: 500KB PDF files for postgresql8.3, which is the most efficient way?

2011-05-27 Thread Samuel Gendler
On Fri, May 27, 2011 at 9:15 AM, Emi Lu em...@encs.concordia.ca wrote:


  Solution:
  
  (1) Save pdfs to file system, only point file name in psql8.3

  (2) Save oids of pdfs into table

  (3) Save pdf files as bytea column in psql8.3


 Pros and cons for (1), (2), (3), which is the most efficient way?


You asked about the most efficient but you didn't tell us what you are
doing with the pdf files.  It really doesn't matter how you store it if you
are simply inserting a 500KB object 30 times per year - then it becomes all
about convenience.  I'd probably rule out solution #1 in that case purely
from a maintenance perspective.  Storing the PDFs in the db eliminates the
maintenance work of ensuring that the files are archived along with the db
backups, that file paths stay correct whenever a new db host is provisioned,
and storing the files in the db provides transaction-safe file storage.  If,
on the other hand, you are selecting/updating the rows in question several
hundred times per second or more, then you may well want to put some thought
into efficiency and, assuming some kind of web access, providing your
webserver/caching layer with direct access to the files on the filesystem
for service efficiency.  But if you are only occasionally accessing the
files in question, there's little reason not to put them in the db.

If mybatis can't load the object that references the file without also
loading the entire file from the db - and you are using that object for
other things that will require frequent loading/storing - then you probably
don't want the files in the db.  If it can load the object without loading
the binary data, by lazily loading that column only when requested, then it
doesn't matter.  Using BLOBs guarantees that you can access the entire row
without loading the binary data if mybatis exposes the blob separately from
the rest of the object, but mybatis may be capable of issuing a select
without that column and then grabbing that column as needed in order to
simulate that in the case of a bytea column, anyway.  Internally, a large
bytea column is treated similarly to a blob, with any data over a certain
size not stored in-line with the rest of the row for efficiency reasons.
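
As a concrete illustration of that last point, a bytea-based layout can be as simple as the following (table and column names are made up), and callers that only need metadata simply leave the bytea column out of the select list:

CREATE TABLE pdf_doc (
    id       serial PRIMARY KEY,
    filename text NOT NULL,
    uploaded timestamptz NOT NULL DEFAULT now(),
    content  bytea NOT NULL   -- large values are moved out of line (TOAST) automatically
);

-- listing documents without pulling the binary data:
SELECT id, filename, uploaded FROM pdf_doc;

-- fetching one document's bytes only when actually needed:
SELECT content FROM pdf_doc WHERE id = 1;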


[GENERAL] Shared Buffer Size

2011-05-27 Thread preetika tyagi
Hi All,

I am a little confused about the internal workings of PostgreSQL. There is a
parameter shared_buffers in postgresql.conf and I am assuming that it is used
for buffer management in PostgreSQL. If there is a need to bring a new page
into the buffer and the size exceeds the shared_buffers limit, a victim dirty
page will be written back to the disk.

However, I have read on many links that PostgreSQL depends on the OS for
caching. (http://www.varlena.com/GeneralBits/Tidbits/perf.html#shbuf)

So my question is: is the actual limit of the shared buffer defined by the
OS or by the shared_buffers parameter in postgresql.conf, when deciding
whether a victim dirty page needs to be selected for a disk write or not?

Thanks!


Re: [GENERAL] SELECT COUNT(*) execution time on large tables (v9.0.4-1)

2011-05-27 Thread David Johnston
Counting live data is inherently imprecise.  There are supposedly some
system tables that can give you rough numbers.
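
Those rough numbers come from the planner statistics, e.g. (only as accurate as the last VACUUM/ANALYZE on the table):

SELECT reltuples::bigint AS estimated_rows
  FROM pg_class
 WHERE relname = 'ut_mailing_client';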

You would be better off figuring out an alternative method to get the data
you desire and stop continually recounting all 5.7M records. 

A trigger-driven counter, for insert and delete, is probably the most
obvious method.  Also, say for temporal data, cache the prior monthly counts
and only perform an actual count over the current (changing) month(s).
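
A minimal sketch of such a counter (the row_counts table, function and trigger names are made up; note that the single counter row becomes a contention point under heavy concurrent writes):

CREATE TABLE row_counts (
    table_name text PRIMARY KEY,
    n          bigint NOT NULL
);

INSERT INTO row_counts
SELECT 'ut_mailing_client', count(*) FROM ut_mailing_client;

CREATE OR REPLACE FUNCTION count_rows() RETURNS trigger AS $$
BEGIN
  IF TG_OP = 'INSERT' THEN
    UPDATE row_counts SET n = n + 1 WHERE table_name = TG_TABLE_NAME;
  ELSIF TG_OP = 'DELETE' THEN
    UPDATE row_counts SET n = n - 1 WHERE table_name = TG_TABLE_NAME;
  END IF;
  RETURN NULL;   -- AFTER trigger, return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER ut_mailing_client_count
  AFTER INSERT OR DELETE ON ut_mailing_client
  FOR EACH ROW EXECUTE PROCEDURE count_rows();

-- then the count is a single-row lookup:
SELECT n FROM row_counts WHERE table_name = 'ut_mailing_client';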

At your table size the brute-force approach is obviously not going to work
so an alternative method needs to be devised, one that eliminates
re-counting previously counted records.  The specific design is going to be
highly dependent on your specific requirements - which is why no generalized
solution exists.  If you provide the why behind the question, and not just
the question, people may be inclined to provide relevant suggestions. 

Issuing a count(*) is not a need - it is an implementation.  The need is
what you end up doing with that number.

Lastly, the time you spent combing the system catalogs would have been
better spent perusing the FAQ linked from the PostgreSQL homepage.  Your
question, in almost the same words, is in the FAQ with a link to the wiki,
which repeats all your observations, explains why the behavior is that
way, and suggests (links to) possible alternatives.  You may wish to go
there now to get more background and ideas.

David J.


 Hi,
 
 My application has a frequent need to issue a select count(*) on tables.
 Some have a large row count. (The example below are from a 5.7 M row;
 Some are larger).
 
 Issuing  either  SELECT COUNT(*)  or SELECT COUNT(Primary_Key_Colum)
 yelds a sequential scan on table;
 
 I have browsed catalog tables, digging for a real time Row.count but  so
far
 did not find any.
 QUESTION: Is there a better (faster) way to obtain the row count from a
 table?
 



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Shared Buffer Size

2011-05-27 Thread Derrick Rice
Check out the "Inside the PostgreSQL Buffer Cache" link here:

http://projects.2ndquadrant.com/talks

Thanks to Greg Smith (active here).

Derrick

On Fri, May 27, 2011 at 3:36 PM, preetika tyagi preetikaty...@gmail.com wrote:

 Hi All,

 I am little confused about the internal working of PostgreSQL. There is a
 parameter shared_buffer in postgres.conf and I am assuming that it is used
 for buffer management in PostgreSQL. If there is a need to bring in a new
 page in the buffer and size exceeds the shared_buffer limit, a victim dirty
 page will be written back to the disk.

 However, I have read on many links that PostgreSQL depends on the OS for
 caching. (http://www.varlena.com/GeneralBits/Tidbits/perf.html#shbuf)

 So my question is, the actual limit of the shared buffer will be defined by
 OS or the shared_buffer parameter in the postgres.conf to figure whether a
 victim dirty page needs to be selected for disk write or not?

 Thanks!



[GENERAL] String library

2011-05-27 Thread Little, Douglas
I'm working on a string parser.  Is there a user module that expands the set of
string functions?



Doug Little

Sr. Data Warehouse Architect | Business Intelligence Architecture | Orbitz 
Worldwide
500 W. Madison, Suite 1000  Chicago IL 60661| Office 312.260.2588 | Fax 
312.894.5164 | Cell 847-997-5741
douglas.lit...@orbitz.com
orbitz.com http://www.orbitz.com/ | ebookers.com http://www.ebookers.com/ |
hotelclub.com http://www.hotelclub.com/ | cheaptickets.com http://www.cheaptickets.com/ |
ratestogo.com http://www.ratestogo.com/ | asiahotels.com http://www.asiahotels.com/


Re: [GENERAL] Shared Buffer Size

2011-05-27 Thread preetika tyagi
Hi Derrick,

Thank you for your response.
I saw this document and am trying to understand the "Interaction with the
Operating System Cache" section mentioned in it.

I have the following question:
How does the shared buffer in Postgres rely on the operating system cache?
Suppose my RAM is 8 GB and shared_buffers is 24 MB in Postgres, there are
some dirty pages in shared_buffers, and I need to write a dirty page back to
the disk to bring in a new page. What happens in this case? Will the dirty
page be written to the disk, given the shared_buffers size of 24 MB? Or will
it not be written, and instead stay in the 8 GB of RAM?

Thanks,
Preetika

On Fri, May 27, 2011 at 2:11 PM, Derrick Rice derrick.r...@gmail.com wrote:

 Check out the Inside the PostgreSQL Buffer Cache link here:

 http://projects.2ndquadrant.com/talks

 Thanks to Greg Smith (active here).

 Derrick


 On Fri, May 27, 2011 at 3:36 PM, preetika tyagi 
 preetikaty...@gmail.comwrote:

 Hi All,

 I am little confused about the internal working of PostgreSQL. There is a
 parameter shared_buffer in postgres.conf and I am assuming that it is used
 for buffer management in PostgreSQL. If there is a need to bring in a new
 page in the buffer and size exceeds the shared_buffer limit, a victim dirty
 page will be written back to the disk.

 However, I have read on many links that PostgreSQL depends on the OS for
 caching. (http://www.varlena.com/GeneralBits/Tidbits/perf.html#shbuf)

 So my question is, the actual limit of the shared buffer will be defined
 by OS or the shared_buffer parameter in the postgres.conf to figure whether
 a victim dirty page needs to be selected for disk write or not?

 Thanks!