Re: [GENERAL] Relationship between ulimit and effective_cache_size, work_mem, shared_buffers, maintenance_work_mem and vacuum

2010-08-16 Thread Greg Smith

Sandeep Srinivasa wrote:


I was looking at ways to optimize the postgres database and yet limit 
the amount of memory that it can consume.


You can't usefully limit the amount of memory that PostgreSQL will 
consume yet.  Each client can easily allocate multiples of work_mem 
worth of memory as they run queries, and there's temp_buffers to be 
concerned about too.  One way to work around problems like this is to 
significantly limit the number of clients that can be running at a time, 
using something like a connection pooler, and then keep a low setting 
for max_connections.  This can have some positive performance benefits 
of its own, and it will keep the number of clients (the only real source 
of variable allocations on the server) from getting too large.  Software 
suitable for that purpose includes pgbouncer and pgpool.
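
A quick sketch of checking this from SQL (example values only; max_connections
itself lives in postgresql.conf and needs a restart to change):

  -- how many backends are connected right now
  SELECT count(*) AS backends FROM pg_stat_activity;
  -- keep this not far above the size your connection pool is configured for
  SHOW max_connections;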


Now, the effective cache size variable seems more of a hint to the 
query planner, than any hard limit on the database server.


That's correct.  It doesn't allocate anything.  Doesn't limit anything 
either.


Q1.  if I add ulimit -m and ulimit -v lines in my postgres upstart 
files  will that be good enough to hard-limit Postgres memory usage ?


After fighting a few random crashes where the server runs into ulimit, 
you will find that trying to hard limit PostgreSQL memory usage is more 
trouble than it's worth.  It's really a bad place to go.


Q2.  once I have decided my max memory allocation (call it MY_ULIMIT) 
- should effective cache size be set to MY_ULIMIT - 256 - 12 - 20? Or round it 
off to MY_ULIMIT - 512MB, maybe?


effective_cache_size has no relation to the limits.  Take a look at how 
much of your operating system cache you think might be free at any 
time.  Figure out what percentage of that you might want PostgreSQL to 
be able to use sometime.  Set effective_cache_size to it.  If a query 
goes wild and decides to execute a really bad query plan that reads a 
bunch of data, it will trash the operating system cache regardless; you 
can't stop it like this.
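
A sketch with made-up numbers: if roughly 6GB of OS cache is typically free and
you want the planner to assume PostgreSQL can use about two thirds of it:

  -- in postgresql.conf (planner hint only; nothing is allocated):
  --   effective_cache_size = 4GB
  SHOW effective_cache_size;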


Q3. Or will doing something like this play havoc with the query 
planner/unexplained OOM/crashes ?


If you ever let the system get so low on RAM that the Linux OOM killer 
becomes active, it will almost always kill the main database server 
process, the one that spawns all the clients off, due to how Linux does 
shared memory accounting.  This is another really bad thing to be avoided.


1. will this affect the memory usage of vacuum (going to be using 
default vacuum settings for 8.4) - because ideally I would want to 
have some control over it as well.


Each of the autovacuum processes (defaulting to 3) will use up to 
maintenance_work_mem worth of memory when they are running.  You should 
account for that when estimating peak usage.
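
A rough peak-usage sketch, with assumed example values:

  -- shared_buffers                                    e.g. 512MB
  -- + clients * work_mem * (a few sorts/hashes each)  e.g. 20 * 16MB * 2
  -- + autovacuum_max_workers * maintenance_work_mem   e.g. 3 * 64MB
  SHOW autovacuum_max_workers;
  SHOW maintenance_work_mem;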


2. Would I have to tune max_connections, max_files_per_process (and 
any related variables) ?


Limiting max_connections, and accordingly dealing with the fact that 
some connections might be refused temporarily in your application, is 
the most effective thing you can do here.  max_files_per_process is 
really secondary to any of the other bits you're asking about.


3. When I turn on WAL, would I have to tune wal_buffers accordingly & set 
effective cache size to account for wal_buffers as well?


Set wal_buffers somewhere between 1MB and 16MB, include it in the 
general server shared memory overhead, and then ignore it.  It takes up 
a little memory but isn't nearly as important as these other bits.
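
For example (an assumed value, not a recommendation for any particular workload):

  -- in postgresql.conf:  wal_buffers = 8MB   (allocated once, in shared memory)
  SHOW wal_buffers;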


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] good exception handling architecture

2010-08-16 Thread Atul.Goel
Hi Alban,

Thanks for the reply.

I checked that NOTIFY will not work until the transaction is committed. So in our 
case, since the transaction is being rolled back, there will be no notify. I am 
new to Postgres, so I am not sure how to run LISTEN as a daemon so that it keeps 
running to be notified.

Here is the extract from 
http://www.postgresql.org/docs/8.4/static/sql-notify.html
NOTIFY interacts with SQL transactions in some important ways. Firstly, if a 
NOTIFY is executed inside a transaction, the notify events are not delivered 
until and unless the transaction is committed. This is appropriate, since if 
the transaction is aborted, all the commands within it have had no effect, 
including NOTIFY.
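
A minimal sketch of that behaviour (the channel name is made up):

  -- session 1
  LISTEN app_errors;
  -- session 2
  BEGIN;
  NOTIFY app_errors;   -- queued, not delivered yet
  ROLLBACK;            -- the queued notification is discarded; session 1 never sees it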

Is there any other way of inter-process communication in Postgres?

Regards,
Atul Goel

-Original Message-
From: Alban Hertroys [mailto:dal...@solfertje.student.utwente.nl]
Sent: 13 August 2010 17:50
To: Atul Goel
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] good exception handling architecture

On 13 Aug 2010, at 14:07, atul.g...@globaldatapoint.com 
atul.g...@globaldatapoint.com wrote:

 How can we do error logging in Postgres. I am trying to create a LOG_AND_STOP 
 method which would be generic to the code. I have pasted the code I have 
 written. This code returns no data which is understandable as the error is 
 thrown to the external world.

 Can I write the code somehow.

 CREATE SCHEMA test
   AUTHORIZATION postgres;

 CREATE TABLE test.error_log
 (
   error_desc text
 );


 CREATE OR REPLACE FUNCTION test.log_and_stop(err_desc text)
   RETURNS void AS
 $BODY$
 Declare
 x integer;
 begin
 begin
 insert into test.error_log values (err_desc);
 end;
 begin
 raise exception '%',err_desc;

Your transaction got rolled back here, hence the lack of data in your error_log 
table.

I don't think there's any way to log errors into a table from within the same 
transaction, you'll probably need to use a separate connection for the logging. 
I think LISTEN/NOTIFY may come in handy there.

 end;
 end;
 $BODY$
   LANGUAGE 'plpgsql' VOLATILE
   COST 100;
 ALTER FUNCTION test.log_and_stop(text) OWNER TO postgres;

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] InitDB: Bad system call

2010-08-16 Thread Torsten Zühlsdorff

Hello,


... The simplest explanation
I can think of is that it's *only* shmctl that is malfunctioning, not
the other SysV shared memory calls.  Which is even weirder, and
definitely seems to move the problem into the category of kernel bug
rather than configuration mistake.


Hmmm ... Google turned up the information that FreeBSD migrated from int
to size_t variables for shared memory size between 7.0 and 8.0, and in
particular that the size of the struct used by shmctl() changed in
8.0.  So I'm now wondering if what you're dealing with is some sort of
version skew problem.  Could it be that you built Postgres against
system header files that don't match your kernel version?  I'm not
exactly sure how that would manifest as this particular signal,
but it seems worth checking.


I have the correct header files, but that brings me to an interesting 
observation and a workaround.


Before I built the new jail, I checked out the newest sources for 
FreeBSD 7.0 and recompiled the world. With the new world I built the 
jail, and the problems occurred.
Meanwhile there are two jails with PostgreSQL running on the same 
server. IPC problems seem unlikely to me, because those error messages 
normally look very different, and the other instances are running 
without problems ;)


What I've done now was to disable an old jail and copy it to a new 
location. After some reconfiguration I used the copy as the new jail and 
installed PostgreSQL. And it works.


That supports your assumption that the problem must lie in FreeBSD. But 
this will be hard to debug, because the machine's last make world was 3 years 
ago. I will describe the problem to the FreeBSD community.


Thanks for all your help and time,
Torsten

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] good exception handling architecture

2010-08-16 Thread Adrian von Bidder
Heyho!

[[ logging from postgres ]]

I think you can use dblink to do this - everything you do via dblink happens 
in a separate db session and so it shouldn't be affected by what happens in 
your transaction.

Depending on your requirements, system resources may be an issue though 
(you're opening a full additional db session.)
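
A hedged sketch of that approach, assuming the dblink contrib module is 
installed and reusing the error_log table from earlier in the thread (the 
connection string is made up):

  SELECT dblink_connect('logger', 'dbname=mydb');
  SELECT dblink_exec('logger',
         $$INSERT INTO test.error_log VALUES ('something went wrong')$$);
  SELECT dblink_disconnect('logger');
  -- the insert was committed by the separate session, so it survives a
  -- ROLLBACK in the calling transaction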

cheers
-- vbi


-- 
Although this syntax is required by the SQL standard, it's not clear
what the standard expects it to do, except that it's most likely not
what we were doing.
-- PostgreSQL 8.3 release notes




[GENERAL] How to refer to computed columns from other computed columns?

2010-08-16 Thread Matthew Wilson
I'm converting some procedural code to SQL as an experiment.  Here's the
pseudocode:

c = a - b
if c < 0 then d = 'no'
else d = 'yes'

In SQL, I've got this:

select a, b, a - b as c,
case when a - b < 0 then 'no'
else 'yes'
end as d

from foo;

This is a trivial example, but you can see how I calculate a - b two
separate times.

In reality, I have much nastier calculations and they happen more than
just twice.

I'm looking for an elegant solution for this puzzle.  I don't want to
repeat that a - b part over and over because I likely will need to
change how c gets defined and I don't want to have to change more than
one place in the code.

All I can come up with so far is to use a view and then another view on
top of that one:

create view v1 as
select a, b, a - b as c
from foo;

create view v2 as
select a, b, c,
case when c < 0 then 'no'
else 'yes'
end as d
from v1;

This is better than the first solution because c is only defined in a
single place.  Is this the best possible solution?

Thanks for the help.

Matt


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to refer to computed columns from other computed columns?

2010-08-16 Thread Tom Lane
Matthew Wilson m...@tplus1.com writes:
 All I can come up with so far is to use a view and then another view on
 top of that one:

Note that you don't actually need a view, as you can just write the
subselect in-line:

 select a, b, c,
 case when c < 0 then 'no'
 else 'yes'
 end as d
 from (select a, b, a - b as c from foo) as v1;

This is the standard method for avoiding repeat calculations in SQL.

One thing to keep in mind is that the planner will usually try to
flatten a nested sub-select (and whether it was written out manually
or pulled from a view does not matter here).  This will result in the
sub-select's expressions getting inlined into the parent, so that the
calculations will actually get done more than once.  If you're trying
to reduce execution time not just manual labor, you may want to put an
offset 0 into the sub-select to create an optimization fence.  But
test whether that really saves anything --- if there are bigger joins
or additional WHERE conditions involved, you can easily lose more than
you gain by preventing flattening.
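
For example, a sketch of the fenced version of the query above:

 select a, b, c,
 case when c < 0 then 'no'
 else 'yes'
 end as d
 from (select a, b, a - b as c from foo offset 0) as v1;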

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to refer to computed columns from other computed columns?

2010-08-16 Thread Eric Ndengang

Am 16.08.2010 14:45, schrieb Matthew Wilson:

I'm converting some procedural code to SQL as an experiment.  Here's the
pseudocode:

 c = a - b
 if c < 0 then d = 'no'
 else d = 'yes'

In SQL, I've got this:

 select a, b, a - b as c,
 case when a - b < 0 then 'no'
 else 'yes'
 end as d

 from foo;

This is a trivial example, but you can see how I calculate a - b two
separate times.

In reality, I have much nastier calculations and they happen more than
just twice.

I'm looking for an elegant solution for this puzzle.  I don't want to
repeat that a - b part over and over because I likely will need to
change how c gets defined and I don't want to have to change more than
one place in the code.

All I can come up with so far is to use a view and then another view on
top of that one:

 create view v1 as
 select a, b, a - b as c
 from foo;

 create view v2 as
 select a, b, c,
 case when c < 0 then 'no'
 else 'yes'
 end as d
 from v1;

This is better than the first solution because c is only defined in a
single place.  Is this the best possible solution?

Thanks for the help.

Matt


   
You can also use a WITH query (common table expression) to solve this problem, 
like this:


with table_1 as (select a, b, a - b as c from foo)
select a, b, c,
       case when c < 0 then 'no'
            else 'yes' end as d
from table_1;

I hope it will help you.

--
Eric Ndengang
Junior Datenbankentwickler

Affinitas GmbH  |  Kohlfurter Straße 41/43  |  10999 Berlin  |  Germany
email: eric.ndengang_fo...@affinitas.de  | tel: +49.(0)30. 991 949 5 0  |  
www.edarling.de

Geschäftsführer: Lukas Brosseder, David Khalil, Kai Rieke, Christian Vollmann
Eingetragen beim Amtsgericht Berlin, HRB 115958


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Using record.items() method

2010-08-16 Thread Luís de Sousa
Hello everyone,

I'm trying to code a function that processes records from various tables
performing the same action in each case. The function receives as an
argument a RECORD type variable and I'd like to iterate through each of its
elements. According to the documentation here:

http://python.projects.postgresql.org/pldocs/plpython3-postgres-types-record.html

there exists an items() method that returns a set of tuples each with the
attribute name and the associated value. How can I use this method with
pl/pgsql?

Thank you,

Luís


Re: [GENERAL] Wrong ORDER BY on a numeric value result

2010-08-16 Thread Stefan Wild
The answer is quite simple: this statement was composed by several functions 
used for ordering on string values as well as on numeric ones. 
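
A small sketch of the pitfall explained in the quoted reply below (table and 
column names are made up):

  -- numeric cast to text sorts lexicographically: '10' comes before '9'
  SELECT * FROM items ORDER BY upper(price::text);
  -- dropping the cast restores numeric ordering: 9, 10, 100
  SELECT * FROM items ORDER BY price;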


--- Thomas Kellerer spam_ea...@gmx.net schrieb am So, 15.8.2010:

 Von: Thomas Kellerer spam_ea...@gmx.net
 Betreff: Re: [GENERAL] Wrong ORDER BY on a numeric value result
 An: pgsql-general@postgresql.org
 Datum: Sonntag, 15. August, 2010 10:43 Uhr
 Stefan Wild wrote on 15.08.2010 10:36:
  column is numeric, but upper() works on text, and returns text, so your
  numeric column got casted to text by using upper (which is pointless
  anyway - there is no upper version of digits).
  remove upper() and you'll be fine.
 
  Thank you guys! That was the point.
 
 The real question is:
 
 what did you try to accomplish with the UPPER() on a numeric column?
 
 Regards
 Thomas
 
 
 -- 
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general
 




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] good exception handling architecture

2010-08-16 Thread Alban Hertroys
On 16 Aug 2010, at 10:45, atul.g...@globaldatapoint.com 
atul.g...@globaldatapoint.com wrote:

 Hi Alban,
 
 Thanks for the reply.
 
 I checked that NOTIFY will not work until the transaction is committed. So in 
 our case, since the transaction is being rolled back, there will be no notify.

Ah, you're right! I totally forgot about that...

 Is there any other way of inter-process communication in Postgres?

Of course there is :)

There's nothing stopping you from writing a few custom functions in an 
untrusted language that does your IPC for you. C comes to mind, but PL/Perlu, 
PL/PythonU, etc. should be able to do it as well (and probably easier).

Usually if you're talking about performing I/O external to the database you'll 
get warned that transactions involving those calls can't be rolled back, but in 
this case that's exactly what you want.
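
A hedged sketch of that idea, assuming the untrusted plpythonu language is 
installed (the file path and function name are made up):

CREATE OR REPLACE FUNCTION log_to_file(text) RETURNS void AS $$
    # external file I/O: it is not undone when the calling transaction rolls back
    f = open('/tmp/pg_app_errors.log', 'a')
    f.write(args[0] + "\n")
    f.close()
$$ LANGUAGE plpythonu;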

Now I know that Postgres uses some IPC internally as well (at least shared 
memory), so maybe it's possible to use some of the internal mechanisms. I have 
no idea whether that's possible or at all advisable, I'm sure someone (probably 
Tom) will chime in regarding that.

Cheers!

 -Original Message-
 From: Alban Hertroys [mailto:dal...@solfertje.student.utwente.nl]
 Sent: 13 August 2010 17:50
 To: Atul Goel
 Cc: pgsql-general@postgresql.org
 Subject: Re: [GENERAL] good exception handling architecture
 
 On 13 Aug 2010, at 14:07, atul.g...@globaldatapoint.com 
 atul.g...@globaldatapoint.com wrote:
 
 How can we do error logging in Postgres. I am trying to create a 
 LOG_AND_STOP method which would be generic to the code. I have pasted the 
 code I have written. This code returns no data which is understandable as 
 the error is thrown to the external world.
 
 Can I write the code somehow.
 
 CREATE SCHEMA test
  AUTHORIZATION postgres;
 
 CREATE TABLE test.error_log
 (
  error_desc text
 );
 
 
 CREATE OR REPLACE FUNCTION test.log_and_stop(err_desc text)
  RETURNS void AS
 $BODY$
 Declare
 x integer;
 begin
 begin
 insert into test.error_log values (err_desc);
 end;
 begin
 raise exception '%',err_desc;
 
 Your transaction got rolled back here, hence the lack of data in your 
 error_log table.
 
 I don't think there's any way to log errors into a table from within the same 
 transaction, you'll probably need to use a separate connection for the 
 logging. I think LISTEN/NOTIFY may come in handy there.
 
 end;
 end;
 $BODY$
  LANGUAGE 'plpgsql' VOLATILE
  COST 100;
 ALTER FUNCTION test.log_and_stop(text) OWNER TO postgres;
 
 Alban Hertroys
 
 --
 If you can't see the forest for the trees,
 cut the trees and you'll see there is no forest.
 
 
 
 
 
 
 -- 
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general
 

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Encoding change question...

2010-08-16 Thread Karl Denninger
So I have myself a nice pickle here.

I've got a database which was originally created with SQL_ASCII for the
encoding (anything goes text fields)

Unfortunately, I have a bunch of data that was encoded in UTF-8 that's
in an RSS feed that I need to load into said database.  iconv barfs all
over this file in an attempt to turn it into ISO-8859 (which is what the
web application currently attached to that database is emitting and
collecting.)  It appears the problem is (mostly) things like the
stylized double-quotes.

So I figured I'd go the other way, and convert what I have now in the
tables into UTF8.

Well, except that doesn't work either.

ticker=# select convert_to(subject, 'utf8') from post where
ordinal='2098167';
   convert_to  

 1%: the interest rate on IBM\222s most recent three-year bond.
(1 row)

\222 is the correct code point for the styled single apostrophe that is
in that place in ISO-8859-1 in the source.  However, the UTF prefix is
missing, as are the other two code-point characters (that is, I got the
code point but not the other two bytes that should be in front of it. 
And if I set the code page on the web site to UTF-8, and also set the
encoding on the SQL session to UTF-8 I don't get the three-byte code - I
just get the one byte.  That's a bust.

There are TWO fields in this database that need converted.  I presumed I
could do something like this:

# update post set field1 = convert_to(field1, 'utf8');

It runs to completion without complaint but produces the above.  No good.

So is there a way to do this?  I do NOT want to dump, iconv on the
dumped file, then reload the database if I can possibly avoid it.  Yes,
I know that will work (I've tested it on my development box), but being
able to do this hot would be DRAMATICALLY preferred.  If I do the
iconv route I am basically rebuilding the entire database with all
that's involved in doing so in terms of downtime and such.  VERY
undesirable.

(Equally-valid would be a way to convert UTF-8 to ISO-8859-1 (Latin1),
thereby allowing me to convert the incoming data stream to what's
already in the system, but thus far I've found no joy on that at all.)

Ideas?

-- Karl

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Encoding change question...

2010-08-16 Thread Peter C. Lai
The double quotes aren't UTF-8; they come from people copying and pasting from
Microsoft stuff, which is Windows-1252. So try using that with iconv instead of UTF-8.

On 2010-08-16 12:40:03PM -0500, Karl Denninger wrote:
 So I have myself a nice pickle here.
 
 I've got a database which was originally created with SQL_ASCII for the
 encoding (anything goes text fields)
 
 Unfortunately, I have a bunch of data that was encoded in UTF-8 that's
 in an RSS feed that I need to load into said database.  iconv barfs all
 over this file in an attempt to turn it into ISO-8859 (which is what the
 web application currently attached to that database is emitting and
 collecting.)  It appears the problem is (mostly) things like the
 stylized double-quotes.
 
 So I figured I'd go the other way, and convert what I have now in the
 tables into UTF8.
 
 Well, except that doesn't work either.
 
 ticker=# select convert_to(subject, 'utf8') from post where
 ordinal='2098167';
convert_to  
 
  1%: the interest rate on IBM\222s most recent three-year bond.
 (1 row)
 
 \222 is the correct code point for the styled single apostrophe that is
 in that place in ISO-8859-1 in the source.  However, the UTF prefix is
 missing, as are the other two code-point characters (that is, I got the
 code point but not the other two bytes that should be in front of it. 
 And if I set the code page on the web site to UTF-8, and also set the
 encoding on the SQL session to UTF-8 I don't get the three-byte code - I
 just get the one byte.  That's a bust.
 
 There are TWO fields in this database that need converted.  I presumed I
 could do something like this:
 
 # update post set field1 = convert_to(field1, 'utf8');
 
 It runs to completion without complaint but produces the above.  No good.
 
 So is there a way to do this?  I do NOT want to dump, iconv on the
 dumped file, then reload the database if I can possibly avoid it.  Yes,
 I know that will work (I've tested it on my development box), but being
 able to do this hot would be DRAMATICALLY preferred.  If I do the
 iconv route I am basically rebuilding the entire database with all
 that's involved in doing so in terms of downtime and such.  VERY
 undesirable.
 
 (Equally-valid would be a way to convert UTF-8 to ISO-8859-1 (Latin1),
 thereby allowing me to convert the incoming data stream to what's
 already in the system, but thus far I've found no joy on that at all.)
 
 Ideas?
 
 -- Karl
 

 

 
 -- 
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general


-- 
===
Peter C. Lai | Bard College at Simon's Rock
Systems Administrator| 84 Alford Rd.
Information Technology Svcs. | Gt. Barrington, MA 01230 USA
peter AT simons-rock.edu | (413) 528-7428
===


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to refer to computed columns from other computed columns?

2010-08-16 Thread Matthew Wilson
On Mon Aug 16 10:26:36 2010, Tom Lane wrote:
 Matthew Wilson m...@tplus1.com writes:
 All I can come up with so far is to use a view and then another view on
 top of that one:

 Note that you don't actually need a view, as you can just write the
 subselect in-line:

  select a, b, c,
  case when c < 0 then 'no'
  else 'yes'
  end as d
  from (select a, b, a - b as c from foo) as v1;

 This is the standard method for avoiding repeat calculations in SQL.

 One thing to keep in mind is that the planner will usually try to
 flatten a nested sub-select (and whether it was written out manually
 or pulled from a view does not matter here).  This will result in the
 sub-select's expressions getting inlined into the parent, so that the
 calculations will actually get done more than once.  If you're trying
 to reduce execution time not just manual labor, you may want to put an
 offset 0 into the sub-select to create an optimization fence.  But
 test whether that really saves anything --- if there are bigger joins
 or additional WHERE conditions involved, you can easily lose more than
 you gain by preventing flattening.

   regards, tom lane


Thanks so much for the help!

I don't care if the code is rearranged so that c is replaced with an
inline definition during compilation.  I'm not concerned about
efficiency here.  I just don't want to have to redefine it manually over
and over again, because I know that as I update how c is defined, I'll
forget to update it everywhere.

Maybe sql needs a preprocessing macro language like C.

ducks

Matt


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Encoding change question...

2010-08-16 Thread Karl Denninger
Peter C. Lai wrote:
 The double quotes aren't UTF-8; they come from people copying and pasting from
 Microsoft stuff, which is Windows-1252. So try using that with iconv instead of UTF-8.

 On 2010-08-16 12:40:03PM -0500, Karl Denninger wrote:
   
 So I have myself a nice pickle here.

 I've got a database which was originally created with SQL_ASCII for the
 encoding (anything goes text fields)

 Unfortunately, I have a bunch of data that was encoded in UTF-8 that's
 in an RSS feed that I need to load into said database.  iconv barfs all
 over this file in an attempt to turn it into ISO-8859 (which is what the
 web application currently attached to that database is emitting and
 collecting.)  It appears the problem is (mostly) things like the
 stylized double-quotes.

 So I figured I'd go the other way, and convert what I have now in the
 tables into UTF8.

 Well, except that doesn't work either.

 ticker=# select convert_to(subject, 'utf8') from post where
 ordinal='2098167';
convert_to  
 
  1%: the interest rate on IBM\222s most recent three-year bond.
 (1 row)

 \222 is the correct code point for the styled single apostrophe that is
 in that place in ISO-8859-1 in the source.  However, the UTF prefix is
 missing, as are the other two code-point characters (that is, I got the
 code point but not the other two bytes that should be in front of it. 
 And if I set the code page on the web site to UTF-8, and also set the
 encoding on the SQL session to UTF-8 I don't get the three-byte code - I
 just get the one byte.  That's a bust.

 There are TWO fields in this database that need converted.  I presumed I
 could do something like this:

 # update post set field1 = convert_to(field1, 'utf8');

 It runs to completion without complaint but produces the above.  No good.

 So is there a way to do this?  I do NOT want to dump, iconv on the
 dumped file, then reload the database if I can possibly avoid it.  Yes,
 I know that will work (I've tested it on my development box), but being
 able to do this hot would be DRAMATICALLY preferred.  If I do the
 iconv route I am basically rebuilding the entire database with all
 that's involved in doing so in terms of downtime and such.  VERY
 undesirable.

 (Equally-valid would be a way to convert UTF-8 to ISO-8859-1 (Latin1),
 thereby allowing me to convert the incoming data stream to what's
 already in the system, but thus far I've found no joy on that at all.)

 Ideas?

 -- Karl
 
No, the problem is that the existing data in the database and the web
app that are using it are both ISO-8859-1.

The issue is that the data I need to load INTO the system is in UTF-8
(and really is), and is full of three-byte escapes.  iconv barfs all
over it trying to go to ISO-8859-1, so I can't convert the INCOMING data
to what's in the system now.  It also won't convert it to Windows-1252
(which is kind of a superset of ISO-8859, and thus the server might not
toss on it too badly.)

FS/karl:~/tmp iconv -f UTF-8 -t WINDOWS-1252 rss-marketticker.php > /dev/null
iconv: (stdin):2766:6125: cannot convert

FS/karl:~/tmp iconv -f UTF-8 -t ISO-8859-1 rss-marketticker.php > /dev/null
iconv: (stdin):222:391: cannot convert

The data set that is incoming is some ~30,000 records.  The data in the
system already is ~2m records.  Obviously, if I can convert the incoming
data that's the better choice, but it appears I can't.

Thus the desire to go the other way - turn the existing data in the
database into UTF-8, which is probably desirable in the future anyway.

I can iconv a pg_dumpall of the database TO UTF-8 and it succeeds (at
least in theory), but the database into which I want to load this data
set is HUGE and dumping and reloading it isn't on my A list of things
to do.

If I have to I have to - but if I can avoid this I certainly WANT TO.

What I don't understand is why the Postgres function convert_to()
doesn't emit the THREE BYTE sequence, and only emits the codepoint.


-- Karl
-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to refer to computed columns from other computed columns?

2010-08-16 Thread Boszormenyi Zoltan
Matthew Wilson írta:
 On Mon Aug 16 10:26:36 2010, Tom Lane wrote:
   
 Matthew Wilson m...@tplus1.com writes:
 
 All I can come up with so far is to use a view and then another view on
 top of that one:
   
 Note that you don't actually need a view, as you can just write the
 subselect in-line:

  select a, b, c,
  case when c < 0 then 'no'
  else 'yes'
  end as d
  from (select a, b, a - b as c from foo) as v1;

 This is the standard method for avoiding repeat calculations in SQL.

 One thing to keep in mind is that the planner will usually try to
 flatten a nested sub-select (and whether it was written out manually
 or pulled from a view does not matter here).  This will result in the
 sub-select's expressions getting inlined into the parent, so that the
 calculations will actually get done more than once.  If you're trying
 to reduce execution time not just manual labor, you may want to put an
 offset 0 into the sub-select to create an optimization fence.  But
 test whether that really saves anything --- if there are bigger joins
 or additional WHERE conditions involved, you can easily lose more than
 you gain by preventing flattening.

  regards, tom lane

 

 Thanks so much for the help!

 I don't care if the code is rearranged so that c is replaced with an
 inline definition during compilation.  I'm not concerned about
 efficiency here.  I just don't want to have to redefine it manually over
 and over again, because I know that as I update how c is defined, I'll
 forget to update it everywhere.

 Maybe sql needs a preprocessing macro language like C.
   

Or maybe we can dust off my GENERATED column patch
I posted here in 2006. :-)

Best regards,
Zoltán Böszörményi


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to refer to computed columns from other computed columns?

2010-08-16 Thread Thom Brown
2010/8/16 Boszormenyi Zoltan z...@cybertec.at:
 Matthew Wilson írta:
 On Mon Aug 16 10:26:36 2010, Tom Lane wrote:

 Matthew Wilson m...@tplus1.com writes:

 All I can come up with so far is to use a view and then another view on
 top of that one:

 Note that you don't actually need a view, as you can just write the
 subselect in-line:

      select a, b, c,
       case when c < 0 then 'no'
      else 'yes'
      end as d
      from (select a, b, a - b as c from foo) as v1;

 This is the standard method for avoiding repeat calculations in SQL.

 One thing to keep in mind is that the planner will usually try to
 flatten a nested sub-select (and whether it was written out manually
 or pulled from a view does not matter here).  This will result in the
 sub-select's expressions getting inlined into the parent, so that the
 calculations will actually get done more than once.  If you're trying
 to reduce execution time not just manual labor, you may want to put an
 offset 0 into the sub-select to create an optimization fence.  But
 test whether that really saves anything --- if there are bigger joins
 or additional WHERE conditions involved, you can easily lose more than
 you gain by preventing flattening.

                      regards, tom lane



 Thanks so much for the help!

 I don't care if the code is rearranged so that c is replaced with an
 inline definition during compilation.  I'm not concerned about
 efficiency here.  I just don't want to have to redefine it manually over
 and over again, because I know that as I update how c is defined, I'll
 forget to update it everywhere.

 Maybe sql needs a preprocessing macro language like C.


 Or maybe we can dust off my GENERATED column patch
 I posted here in 2006. :-)

 Best regards,
 Zoltán Böszörményi


You mean this?:
http://archives.postgresql.org/pgsql-hackers/2006-07/msg00543.php
And this?: http://archives.postgresql.org/pgsql-hackers/2006-08/msg00038.php
And this?: http://archives.postgresql.org/pgsql-patches/2007-04/msg00107.php

-- 
Thom Brown
Registered Linux user: #516935

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to refer to computed columns from other computed columns?

2010-08-16 Thread Alvaro Herrera
Excerpts from Boszormenyi Zoltan's message of lun ago 16 14:45:07 -0400 2010:
 Matthew Wilson írta:

  I don't care if the code is rearranged so that c is replaced with an
  inline definition during compilation.  I'm not concerned about
  efficiency here.  I just don't want to have to redefine it manually over
  and over again, because I know that as I update how c is defined, I'll
  forget to update it everywhere.
 
  Maybe sql needs a preprocessing macro language like C.

 
 Or maybe we can dust off my GENERATED column patch
 I posted here in 2006. :-)

Hmm, that seems entirely unrelated ...

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Postgresql's table index compared to that of MySQL

2010-08-16 Thread Andy
For the same data set, with mostly text data, how does the data (table + index) 
size of Postgresql compare to that of MySQL?

In this presentation, the largest blog site in Japan talked about their 
migration from Postgresql to MySQL. One of their reasons for moving away from 
Postgresql was that data size in Postgresql was too large (p. 12 & p. 41). 
Specifically they talked about index being 40% of total data size: 

http://www.scribd.com/doc/2569473/Migrating-from-PostgreSQL-to-MySQL-at-Cocolog-Japans-Largest-Blog-Community

Are there any reasons why table & index sizes of Postgresql should be larger 
than MySQL? Postgresql uses MVCC while InnoDB does not use full MVCC, so 
perhaps that's a factor there.

Does anyone have any actual experience about how the data sizes of Postgresql & 
MySQL compare to each other?

The company in the presentation used Postgresql 8.1. Has there been any 
significant changes in data size between 8.1 and 8.4/9.0?

Thanks.


  

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql's table index compared to that of MySQL

2010-08-16 Thread Tom Lane
Andy angelf...@yahoo.com writes:
 Are there any reasons why table & index sizes of Postgresql should be larger 
 than MySQL?

Well, the per-row headers in PG are definitely bigger because of MVCC
requirements.  It's hard to speculate much about index sizes with
no information about the table/index schemas.

 The company in the presentation used Postgresql 8.1. Has there been any 
 significant changes in data size between 8.1 and 8.4/9.0?

Well, we shaved 4 bytes off the tuple header size since 8.1, and there's
been work on cutting per-field overhead too, and there's now some
ability to control fillfactor in indexes.  But there's really not enough
information here to say how much difference this might've made for them.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql's table index compared to that of MySQL

2010-08-16 Thread Thom Brown
On 16 August 2010 23:59, Andy angelf...@yahoo.com wrote:
 For the same data set, with mostly text data, how does the data (table + 
 index) size of Postgresql compared to that of MySQL?

 In this presentation, the largest blog site in Japan talked about their 
 migration from Postgresql to MySQL. One of their reasons for moving away from 
 Postgresql was that data size in Postgresql was too large (p. 12 & p. 41). 
 Specifically they talked about index being 40% of total data size:

 http://www.scribd.com/doc/2569473/Migrating-from-PostgreSQL-to-MySQL-at-Cocolog-Japans-Largest-Blog-Community

 Are there any reasons why table & index sizes of Postgresql should be larger 
 than MySQL? Postgresql uses MVCC while InnoDB does not use full MVCC, so 
 perhaps that's a factor there.

 Does anyone have any actual experience about how the data sizes of Postgresql 
  & MySQL compare to each other?

 The company in the presentation used Postgresql 8.1. Has there been any 
 significant changes in data size between 8.1 and 8.4/9.0?

 Thanks.


This is quite a crude and unrealistic test (as you'd need a set of
real-world data), but just did a quick test using PostgreSQL 9.0 alpha
4 and MySQL .  I created a new database in both PostgreSQL and MySQL.
Created the same table in both, and loaded identical data (50
paragraphs of lorem ipsum) and got it to insert the table's contents
back into itself until both reached 65,536 rows.  I also did a VACUUM
in PostgreSQL and an OPTIMIZE TABLE in MySQL.

PostgreSQL's table size shows 867 MB
MySQL's table size as MyISAM shows 2,542 MB
MySQL's table size as InnoDB shows: 3,576 MB
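
For reference, a sketch of how the PostgreSQL numbers can be pulled, assuming
the test table is called lorem:

  SELECT pg_size_pretty(pg_relation_size('lorem'))      AS table_size,
         pg_size_pretty(pg_total_relation_size('lorem')
                        - pg_relation_size('lorem'))    AS indexes_and_toast;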

Also bear in mind that MySQL's InnoDB engine doesn't support full text
indexes, and when you can apply full text indexes, it only returns a
result if it matches less than 50% of the total rows in the table.

PostgreSQL provides GIN and GiST types of index which are used for
full text searches, but off the top of my head I don't know if either
is actually equivalent to MySQL's implementation.  I suspect they're
quite different.  Hopefully someone more familiar with both system's
full text search features can answer that.

-- 
Thom Brown
Registered Linux user: #516935

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql's table index compared to that of MySQL

2010-08-16 Thread Andy


--- On Mon, 8/16/10, Thom Brown t...@linux.com wrote:

 This is quite a crude and unrealistic test (as you'd need a
 set of
 real-world data), but just did a quick test using
 PostgreSQL 9.0 alpha
 4 and MySQL .  I created a new database in both
 PostgreSQL and MySQL.
 Created the same table in both, and loaded identical data
 (50
 paragraphs of lorem ipsum) and got it to insert the table's
 contents
 back into itself until both reached 65,536 rows.  I
 also did a VACUUM
 in PostgreSQL and an OPTIMIZE TABLE in MySQL.
 
 PostgreSQL's table size shows 867 MB
 MySQL's table size as MyISAM shows 2,542 MB
 MySQL's table size as InnoDB shows: 3,576 MB
 
 Also bear in mind that MySQL's InnoDB engine doesn't
 support full text
 indexes, and when you can apply full text indexes, it only
 returns a
 result if it matches less than 50% of the total rows in the
 table.
 
 PostgreSQL provides GIN and GiST types of index which are
 used for
 full text searches, but off the top of my head I don't know
 if either
 is actually equivalent to MySQL's implementation.  I
 suspect they're
 quite different.  Hopefully someone more familiar with
 both system's
 full text search features can answer that.
 

Thanks for doing the test.

Your results of 867MB for Postgresql & 3,576 MB for InnoDB are surprising. Do 
you know why it is so much smaller for Postgresql? Are there any indexes?

Are all Postgresql indexes based on GIN & GiST? I'm not using the database for 
full text search, would I still be using GIN/GiST indexes, or would I be using 
the plain old B+ tree?




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Duda sobre campo null!!!

2010-08-16 Thread Lazaro Ruben Garcia Martinez
Hello everyone, for a long time I have had a question about null fields. I wanted 
to know whether the use or the mere presence of nulls, in small or large tables, 
can affect system performance in any way, and I would also like to know what 
drawbacks they can present. 

Regards, and many thanks for your time. 


Re: [GENERAL] Duda sobre campo null!!!

2010-08-16 Thread Digimer
On 10-08-16 09:51 PM, Lazaro Ruben Garcia Martinez wrote:
 Hello everyone, for a long time I have had a question about null fields. I
 wanted to know whether the use or the mere presence of nulls, in small or
 large tables, can affect system performance in any way, and I would also
 like to know what drawbacks they can present.
 
 Regards, and many thanks for your time.

Sorry, but this is an English mailing list. You will get more help if you
can ask again in English.

I believe NULL has no real performance impact. It is just another state;
True, False, Null.


Google translated, sorry for the poor Spanish.

-- 
Digimer
E-Mail: li...@alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Duda sobre campo null!!!

2010-08-16 Thread Lazaro Ruben Garcia Martinez
Hello everyone, I have a question: 
could a null value in a field affect the performance of the database? 

Regards. 
Thank you very much for your time. 

- Digimer li...@alteeve.com escribió: 
 On 10-08-16 09:51 PM, Lazaro Ruben Garcia Martinez wrote: 
  Hello everyone, for a long time I have had a question about null fields. I 
  wanted to know whether the use or the mere presence of nulls, in small or 
  large tables, can affect system performance in any way, and I would also 
  like to know what drawbacks they can present. 
  
  Regards, and many thanks for your time. 
 
 Sorry, but this is an English mailing list. You will get more help if you 
 can ask again in English. 
 
 I believe NULL has no real performance impact. It is just another state; 
 True, False, Null. 
 
  
 Google translated, sorry for the poor Spanish. 
 
 -- 
 Digimer 
 E-Mail: li...@alteeve.com 
 AN!Whitepapers: http://alteeve.com 
 Node Assassin: http://nodeassassin.org 


Re: [GENERAL] Duda sobre campo null!!!

2010-08-16 Thread Scott Marlowe
Not really.  NULLs are stored in a bit field.  Each null takes up one
bit in this field, and as you add more nulls, you get more bytes each
time you go over a multiple of 8.

Also, NULLS can be indexed in postgresql now.  Older versions could
not index nulls (actually they got indexed, but the entries couldn't
be referenced properly).
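
A quick sketch of how to see the effect (the temp table and values are made up):

  CREATE TEMP TABLE t (a int, b int, c int, d int,
                       e int, f int, g int, h int, i int);
  INSERT INTO t VALUES (1,2,3,4,5,6,7,8,9),
                       (1,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL);
  -- pg_column_size() on the whole row shows the mostly-NULL row is noticeably
  -- smaller; the NULLs themselves only cost bits in the per-row null bitmap
  SELECT pg_column_size(t.*) FROM t;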

2010/8/16 Lazaro Ruben Garcia Martinez lgarc...@estudiantes.uci.cu:
 Hello everyone, I have a question:
 could a null value in a field affect the performance of the database?

 Regards.
 Thank you very much for your time.

 - Digimer li...@alteeve.com escribió:
 On 10-08-16 09:51 PM, Lazaro Ruben Garcia Martinez wrote:
  Hello everyone, for a long time I have had a question about null fields. I
  wanted to know whether the use or the mere presence of nulls, in small or
  large tables, can affect system performance in any way, and I would also
  like to know what drawbacks they can present.
 
  Regards, and many thanks for your time.

 Sorry, but this is an English mailing list. You will get more help if you
 can ask again in English.

 I believe NULL has no real performance impact. It is just another state;
 True, False, Null.

 
 Google translated, sorry for the poor Spanish.

 --
 Digimer
 E-Mail:         li...@alteeve.com
 AN!Whitepapers: http://alteeve.com
 Node Assassin:  http://nodeassassin.org




-- 
To understand recursion, one must first understand recursion.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql's table index compared to that of MySQL

2010-08-16 Thread Tom Lane
Andy angelf...@yahoo.com writes:
 Your results of 867MB for Postgresql & 3,576 MB for InnoDB are surprising. Do 
 you know why it is so much smaller for Postgresql? Are there any indexes?

If I understood the original report correctly, they were complaining
mostly about index size, so a table without indexes certainly isn't
a real helpful comparison.  Still, this brings up an important point:
AFAICS the paper doesn't even mention which mysql storage engine they're
using.  So it's *really* hard to tell what we're comparing to.

 Are all Postgresql indexes based on GIN & GiST?

No, certainly not.  See
http://www.postgresql.org/docs/8.4/static/indexes-types.html

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to do hot backup using postgres

2010-08-16 Thread tuanhoanganh
Can anyone answer me?
Thank you very much.

Tuan Hoang Anh

On Sat, Aug 14, 2010 at 10:00 AM, tuanhoanganh hatua...@gmail.com wrote:

 I tried to do a PITR backup using Postgres 8.3.9 on Windows. So I issued
 SELECT pg_start_backup('test');
 After I put the db in backup mode I tried to zip the data directory files
 with 7z. However I encountered the following errors:

 The process cannot access the file because it is being used by another
 process.
 C:\...\8.3\data\global1214
 The process cannot access the file because it is being used by another
 process.
 C:\...\8.3\pg_xlog\00010007001B
 ...
 ...
 But the zip file was created.
 Is this an error? How can I zip the files and perform a hot backup on Windows?




Re: [GENERAL] Postgresql's table index compared to that of MySQL

2010-08-16 Thread Craig Ringer
On 17/08/10 06:59, Andy wrote:

 http://www.scribd.com/doc/2569473/Migrating-from-PostgreSQL-to-MySQL-at-Cocolog-Japans-Largest-Blog-Community
 
 Are there any reasons why table  index sizes of Postgresql should be larger 
 than MySQL? Postgresql uses MVCC while InnoDB does not use full MVCC, so 
 perhaps that's a factor there.

I can't help but wonder if they were doing regular VACUUM FULL
maintenance and being bitten by issues with index bloat as a
consequence. Though after reading the document linked, it's hard to
imagine they would be falling for a mistake that basic.

OTOH, a key clue there is:

  We don't use Auto VACUUM since we are worried about latent
  response time.

ie they wanted to avoid response latency spikes and thought/found they'd
be caused by autovacuum. Given that their discussion is of 7.4 to 8.1,
that's not unreasonable.

It looks to me like their *real* issue was just that they had problems
scaling a single PostgreSQL instance up to meet their needs, and were
having issues with vacuum under continuous high loads. They decided to
move to a sharded system, and moved over to MySQL at that time. That's
not necessarily a bad decision, as MySQL is more widely used in sharded
setups and there are some good tools to help with that sort of use.

It's pretty funny that they had issues with charset handling under Pg,
though, as most people seem to complain that Pg is too strict. Any ideas
what they might be talking about with:

  PostgreSQL allow the out of boundary UTF-8 japanese extended
   character sets and multi bytes character sets which should
   normally come back with an error - instead of accepting them

... and whether it's related to existing or past issues? Or might they
have just been using SQL_ASCII?

There are also some clues that the way they were doing things may not
have been ideal. For example, they mention as a problem with MySQL
that you get a different sort order without order by clause. I
struggle to interpret that in any way but that they were relying on the
undefined ordering of records in a query without an ORDER BY, which is
just silly.

--
Craig Ringer

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Relationship between ulimit and effective_cache_size, work_mem, shared_buffers, maintenance_work_mem and vacuum

2010-08-16 Thread Sandeep Srinivasa
Thanks for the comprehensive reply. It definitely cleared a lot of things up
for me.

regards
Sandeep

On Mon, Aug 16, 2010 at 12:39 PM, Greg Smith g...@2ndquadrant.com wrote:

 Sandeep Srinivasa wrote:


 I was looking at ways to optimize the postgres database and yet limit the
 amount of memory that it can consume.


 You can't usefully limit the amount of memory that PostgreSQL will consume
 yet.  Each client can easily allocate multiples of work_mem worth of memory
 as they run queries, and there's temp_buffers to be concerned about too.
  One way to work around problems like this is to significantly limit the
 number of clients that can be running at a time, using something like a
 connection pooler, and then keep a low setting for max_connections.  This
 can have some positive performance benefits of its own, and it will keep the
 number of clients (the only real source of variable allocations on the
 server) from getting too large.  Software suitable for that purpose includes
 pgbouncer and pgpool.


  Now, the effective cache size variable seems more of a hint to the query
 planner, than any hard limit on the database server.


 That's correct.  It doesn't allocate anything.  Doesn't limit anything
 either.


  Q1.  if I add ulimit -m and ulimit -v lines in my postgres upstart
 files  will that be good enough to hard-limit Postgres memory usage ?


 After fighting a few random crashes where the server runs into ulimit, you
 will find that trying to hard limit PostgreSQL memory usage is more trouble
 than it's worth.  It's really a bad place to go.


  Q2.  once I have decided my max memory allocation (call it MY_ULIMIT) -
 should effective cache size be set to MY_ULIMIT - 256 - 12 - 20? Or round it
 off to MY_ULIMIT - 512MB, maybe?


 effective_cache_size has no relation to the limits.  Take a look at how
 much of your operating system cache you think might be free at any time.
  Figure out what percentage of that you might want PostgreSQL to be able to
 use sometime.  Set effective_cache_size to it.  If a query goes wild and
 decides to execute a really bad query plan that reads a bunch of data, it
 will trash the operating system cache regardless; you can't stop it like
 this.


  Q3. Or will doing something like this play havoc with the query
 planner/unexplained OOM/crashes ?


 If you ever let the system get so low on RAM that the Linux OOM killer
 becomes active, it will almost always kill the main database server process,
 the one that spawns all the clients off, due to how Linux does shared memory
  accounting.  This is another really bad thing to be avoided.


  1. will this affect the memory usage of vacuum (going to be using default
 vacuum settings for 8.4) - because ideally I would want to have some control
 over it as well.


 Each of the autovacuum processes (defaulting to 3) will use up to
 maintenance_work_mem worth of memory when they are running.  You should
 account for that when estimating peak usage.


  2. Would I have to tune max_connections, max_files_per_process (and any
 related variables) ?


 Limiting max_connections, and accordingly dealing with the fact that some
 connections might be refused temporarily in your application, is the most
 effective thing you can do here.  max_files_per_process is really secondary
 to any of the other bits you're asking about.


   3. When I turn on WAL, would I have to tune wal_buffers accordingly & set
 effective cache size to account for wal_buffers as well ?


 Set wal_buffers somewhere between 1MB and 16MB, include it in the general
 server shared memory overhead, and then ignore it.  It takes up a little
 memory but isn't nearly as important as these other bits.

 --
 Greg Smith  2ndQuadrant US  Baltimore, MD
 PostgreSQL Training, Services and Support
 g...@2ndquadrant.com   www.2ndQuadrant.us