Re: [GENERAL] Relationship between ulimit and effective_cache_size, work_mem, shared_buffers, maintenance_work_mem and vacuum

2010-08-16 Thread Sandeep Srinivasa
Thanks for the comprehensive reply. It definitely cleared a lot of things up
for me.

regards
Sandeep

On Mon, Aug 16, 2010 at 12:39 PM, Greg Smith  wrote:

> Sandeep Srinivasa wrote:
>
>>
>> I was looking at ways to optimize the postgres database and yet limit the
>> amount of memory that it can consume.
>>
>
> You can't usefully limit the amount of memory that PostgreSQL will consume
> yet.  Each client can easily allocate multiples of work_mem worth of memory
> as they run queries, and there's temp_buffers to be concerned about too.
>  One way to work around problems like this is to significantly limit the
> number of clients that can be running at a time, using something like a
> connection pooler, and then keep a low setting for max_connections.  This
> can have some positive performance benefits of its own, and it will keep the
> number of clients (the only real source of variable allocations on the
> server) from getting too large.  Software suitable for that purpose includes
> pgbouncer and pgpool.
>
>
>  Now, the "effective cache size" variable seems more of a hint to the query
>> planner, than any hard limit on the database server.
>>
>
> That's correct.  It doesn't allocate anything.  Doesn't limit anything
> either.
>
>
>  Q1.  if I add "ulimit -m" and "ulimit -v" lines in my postgres upstart
>> files  will that be good enough to hard-limit Postgres memory usage ?
>>
>
> After fighting a few random crashes where the server runs into ulimit, you
> will find that trying to hard limit PostgreSQL memory usage is more trouble
> than it's worth.  It's really a bad place to go.
>
>
>  Q2.  once I have decided my max memory allocation (call it MY_ULIMIT) -
>> should effective cache size be set to MY_ULIMIT - 256 - 12 -20 ?  round it
>> off to MY_ULIMIT - 512mb maybe
>>
>
> effective_cache_size has no relation to the limits.  Take a look at how
> much of your operating system cache you think might be free at any time.
>  Figure out what percentage of that you might want PostgreSQL to be able to
> use sometime.  Set effective_cache_size to it.  If a query goes wild and
> decides to execute a really bad query plan that reads a bunch of data, it
> will trash the operating system cache regardless; you can't stop it like
> this.
>
>
>  Q3. Or will doing something like this play havoc with the query
>> planner/unexplained OOM/crashes ?
>>
>
> If you ever let the system get so low on RAM that the Linux OOM killer
> becomes active, it will almost always kill the main database server process,
> the one that spawns all the clients off, due to how Linux does shared memory
> accounting.  This is another really bad thing to be avoided.
>
>
>  1. will this affect the memory usage of vacuum (going to be using default
>> vacuum settings for 8.4) - because ideally I would want to have some control
>> over it as well.
>>
>
> Each of the autovacuum processes (defaulting to 3) will use up to
> maintenance_work_mem worth of memory when they are running.  You should
> account for that when estimating peak usage.
>
>
>  2. Would I have to tune max_connections, max_files_per_process (and any
>> related variables) ?
>>
>
> Limiting max_connections, and accordingly dealing with the fact that some
> connections might be refused temporarily in your application, is the most
> effective thing you can do here.  max_files_per_process is really secondary
> to any of the other bits you're asking about.
>
>
>  3. When I turn on WAL, would I have to tune wal_buffers accordingly and set
>> effective cache size to account for wal_buffers as well ?
>>
>
> Set wal_buffers somewhere between 1MB and 16MB, include it in the general
> server shared memory overhead, and then ignore it.  It takes up a little
> memory but isn't nearly as important as these other bits.
>
> --
> Greg Smith  2ndQuadrant US  Baltimore, MD
> PostgreSQL Training, Services and Support
> g...@2ndquadrant.com   www.2ndQuadrant.us
>
>


Re: [GENERAL] Postgresql's table & index compared to that of MySQL

2010-08-16 Thread Craig Ringer
On 17/08/10 06:59, Andy wrote:

> http://www.scribd.com/doc/2569473/Migrating-from-PostgreSQL-to-MySQL-at-Cocolog-Japans-Largest-Blog-Community
> 
> Are there any reasons why table & index sizes of Postgresql should be larger 
> than MySQL? Postgresql uses MVCC while InnoDB does not use "full" MVCC, so 
> perhaps that's a factor there.

I can't help but wonder if they were doing regular VACUUM FULL
"maintenance" and being bitten by issues with index bloat as a
consequence. Though after reading the document linked, it's hard to
imagine they would be falling for a mistake that basic.

OTOH, a key clue there is:

  We don't use Auto VACUUM since we are worried about latent
  response time.

i.e. they wanted to avoid response latency spikes and thought/found they'd
be caused by autovacuum. Given that their discussion is of 7.4 to 8.1,
that's not unreasonable.

It looks to me like their *real* issue was just that they had problems
scaling a single PostgreSQL instance up to meet their needs, and were
having issues with vacuum under continuous high loads. They decided to
move to a sharded system, and moved over to MySQL at that time. That's
not necessarily a bad decision, as MySQL is more widely used in sharded
setups and there are some good tools to help with that sort of use.

It's pretty funny that they had issues with charset handling under Pg,
though, as most people seem to complain that Pg is too strict. Any ideas
what they might be talking about with:

  "PostgreSQL allow the out of boundary UTF-8 japanese extended
   character sets and multi bytes character sets which should
   normally come back with an error - instead of accepting them"

... and whether it's related to existing or past issues? Or might they
have just been using SQL_ASCII?

There are also some clues that the way they were doing things may not
have been ideal. For example, they mention as a "problem" with MySQL
that you get a "different sort order without order by clause". I
struggle to interpret that in any way but that they were relying on the
undefined ordering of records in a query without an ORDER BY, which is
just silly.

--
Craig Ringer

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to do hot backup using postgres

2010-08-16 Thread tuanhoanganh
Can anyone answer me?
Thank you very much.

Tuan Hoang Anh

On Sat, Aug 14, 2010 at 10:00 AM, tuanhoanganh  wrote:

> I tried to do pitr backup using Postgres 8.3.9 on windows. So I issued
> SELECT pg_start_backup('test');
> After I put the db in backup mode I tried to zip the data directory files
> with 7z. However I encountered the following errors:
>
> The process cannot access the file because it is being used by another
> process.
> C:\...\8.3\data\global1214
> The process cannot access the file because it is being used by another
> process.
> C:\...\8.3\pg_xlog\00010007001B
> ...
> ...
> But the zip file was created.
> Is it an error? How can I zip the files and perform a hot backup on Windows?
>
>
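
For reference, the base-backup sequence in play here boils down to this
sketch (an online copy will see files changing underneath it; per the
documentation, pg_xlog can be omitted from the copy, and consistency comes
from WAL replay at restore time):

SELECT pg_start_backup('test');
-- zip/copy the data directory here, from outside the database
SELECT pg_stop_backup();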


Re: [GENERAL] Postgresql's table & index compared to that of MySQL

2010-08-16 Thread Tom Lane
Andy  writes:
> Your results of 867MB for Postgresql & 3,576 MB for InnoDB are surprising. Do 
> you know why it is so much smaller for Postgresql? Are there any indexes?

If I understood the original report correctly, they were complaining
mostly about index size, so a table without indexes certainly isn't
a real helpful comparison.  Still, this brings up an important point:
AFAICS the paper doesn't even mention which mysql storage engine they're
using.  So it's *really* hard to tell what we're comparing to.

> Are all Postgresql indexes based on GIN & GiST?

No, certainly not.  See
http://www.postgresql.org/docs/8.4/static/indexes-types.html

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Duda sobre campo null!!!

2010-08-16 Thread Scott Marlowe
Not really.  NULLs are stored in a bit field.  Each null takes up one
bit in this field, and as you add more nulls, you get more bytes each
time you go over a multiple of 8.

Also, NULLs can be indexed in PostgreSQL now.  Older versions could
not index nulls (actually they got indexed, but the entries couldn't
be referenced properly).
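
A quick way to see this is pg_column_size() on whole rows (just a sketch;
exact byte counts vary with alignment and server version):

create temp table t (a int, b int, c int);
insert into t values (1, 2, 3), (1, null, null);
-- the second row is smaller: its nulls occupy only bits in the null bitmap
select a, pg_column_size(t.*) as row_bytes from t;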

2010/8/16 Lazaro Ruben Garcia Martinez :
> Hello everyone, I have a question:
> Could a null value in a field affect the performance of the database?
>
> Regards.
> Thank you very much for your time.
>
> - "Digimer"  escribió:
>> On 10-08-16 09:51 PM, Lazaro Ruben Garcia Martinez wrote:
> >> > Hello everyone, for a long time I have had a question about null
> >> > fields. I wanted to know whether the use or mere presence of nulls,
> >> > whether in small or large tables, can affect system performance in
> >> > some way, and I would also like to know what drawbacks they can
> >> > present.
> >> >
> >> > Regards, and thank you very much for your time.
>>
> >> Sorry, but this is an English mailing list. You will get more help
> >> if you can ask again in English.
> >>
> >> I believe NULL has no real performance impact. It is just another
> >> state; True, False, Null.
>>
>> 
>> Google translated, sorry for the poor Spanish.
>>
>> --
>> Digimer
>> E-Mail:         li...@alteeve.com
>> AN!Whitepapers: http://alteeve.com
>> Node Assassin:  http://nodeassassin.org
>>



-- 
To understand recursion, one must first understand recursion.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Duda sobre campo null!!!

2010-08-16 Thread Lazaro Ruben Garcia Martinez
Hello everyone, I have a question:
Could a null value in a field affect the performance of the database?

Regards. 
Thank you very much for your time. 

- "Digimer"  escribió: 
> On 10-08-16 09:51 PM, Lazaro Ruben Garcia Martinez wrote: 
> > Hello everyone, for a long time I have had a question about null
> > fields. I wanted to know whether the use or mere presence of nulls,
> > whether in small or large tables, can affect system performance in
> > some way, and I would also like to know what drawbacks they can
> > present.
> >
> > Regards, and thank you very much for your time.
> 
> Sorry, but this is an English mailing list. You will get more help
> if you can ask again in English.
>
> I believe NULL has no real performance impact. It is just another
> state; True, False, Null.
> 
>  
> Google translated, sorry for the poor Spanish. 
> 
> -- 
> Digimer 
> E-Mail: li...@alteeve.com 
> AN!Whitepapers: http://alteeve.com 
> Node Assassin: http://nodeassassin.org 
>

Re: [GENERAL] Duda sobre campo null!!!

2010-08-16 Thread Digimer
On 10-08-16 09:51 PM, Lazaro Ruben Garcia Martinez wrote:
> Hello everyone, for a long time I have had a question about null
> fields. I wanted to know whether the use or mere presence of nulls,
> whether in small or large tables, can affect system performance in
> some way, and I would also like to know what drawbacks they can
> present.
> 
> Regards, and thank you very much for your time.

Sorry, but this is an English mailing list. You will get more help if
you can ask again in English.

I believe NULL has no real performance impact. It is just another
state; True, False, Null.


Google translated, sorry for the poor Spanish.

-- 
Digimer
E-Mail: li...@alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Duda sobre campo null!!!

2010-08-16 Thread Lazaro Ruben Garcia Martinez
Hello everyone, for a long time I have had a question about null fields. I
wanted to know whether the use or mere presence of nulls, whether in small
or large tables, can affect system performance in some way, and I would
also like to know what drawbacks they can present.

Regards, and thank you very much for your time.


Re: [GENERAL] Postgresql's table & index compared to that of MySQL

2010-08-16 Thread Andy


--- On Mon, 8/16/10, Thom Brown  wrote:

> This is quite a crude and unrealistic test (as you'd need a
> set of
> real-world data), but just did a quick test using
> PostgreSQL 9.0 alpha
> 4 and MySQL .  I created a new database in both
> PostgreSQL and MySQL.
> Created the same table in both, and loaded identical data
> (50
> paragraphs of lorem ipsum) and got it to insert the table's
> contents
> back into itself until both reached 65,536 rows.  I
> also did a VACUUM
> in PostgreSQL and an OPTIMIZE TABLE in MySQL.
> 
> PostgreSQL's table size shows 867 MB
> MySQL's table size as MyISAM shows 2,542 MB
> MySQL's table size as InnoDB shows: 3,576 MB
> 
> Also bear in mind that MySQL's InnoDB engine doesn't
> support full text
> indexes, and when you can apply full text indexes, it only
> returns a
> result if it matches less than 50% of the total rows in the
> table.
> 
> PostgreSQL provides GIN and GiST types of index which are
> used for
> full text searches, but off the top of my head I don't know
> if either
> is actually equivalent to MySQL's implementation.  I
> suspect they're
> quite different.  Hopefully someone more familiar with
> both systems'
> full text search features can answer that.
> 

Thanks for doing the test.

Your results of 867MB for Postgresql & 3,576 MB for InnoDB are surprising. Do 
you know why it is so much smaller for Postgresql? Are there any indexes?

Are all Postgresql indexes based on GIN & GiST? I'm not using the database for 
full text search; would I still be using GIN/GiST indexes, or would I be using 
the plain old B+ tree?




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql's table & index compared to that of MySQL

2010-08-16 Thread Thom Brown
On 16 August 2010 23:59, Andy  wrote:
> For the same data set, with mostly text data, how does the data (table + 
> index) size of Postgresql compare to that of MySQL?
>
> In this presentation, the largest blog site in Japan talked about their 
> migration from Postgresql to MySQL. One of their reasons for moving away from 
> Postgresql was that data size in Postgresql was too large (p. 12 & p. 41). 
> Specifically they talked about index being 40% of total data size:
>
> http://www.scribd.com/doc/2569473/Migrating-from-PostgreSQL-to-MySQL-at-Cocolog-Japans-Largest-Blog-Community
>
> Are there any reasons why table & index sizes of Postgresql should be larger 
> than MySQL? Postgresql uses MVCC while InnoDB does not use "full" MVCC, so 
> perhaps that's a factor there.
>
> Does anyone have any actual experience about how the data sizes of Postgresql 
> & MySQL compare to each other?
>
> The company in the presentation used Postgresql 8.1. Has there been any 
> significant changes in data size between 8.1 and 8.4/9.0?
>
> Thanks.
>

This is quite a crude and unrealistic test (as you'd need a set of
real-world data), but just did a quick test using PostgreSQL 9.0 alpha
4 and MySQL.  I created a new database in both PostgreSQL and MySQL.
Created the same table in both, and loaded identical data (50
paragraphs of lorem ipsum) and got it to insert the table's contents
back into itself until both reached 65,536 rows.  I also did a VACUUM
in PostgreSQL and an OPTIMIZE TABLE in MySQL.
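
For anyone who wants to reproduce it, the doubling trick amounts to
something like this sketch (the table name and seed data here are made up):

create table lorem (body text);
insert into lorem values ('Lorem ipsum dolor sit amet ...');  -- seed row(s)
insert into lorem select body from lorem;  -- repeat until 65,536 rows
vacuum lorem;
select pg_size_pretty(pg_total_relation_size('lorem'));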

PostgreSQL's table size shows 867 MB
MySQL's table size as MyISAM shows 2,542 MB
MySQL's table size as InnoDB shows: 3,576 MB

Also bear in mind that MySQL's InnoDB engine doesn't support full text
indexes, and when you can apply full text indexes, it only returns a
result if it matches less than 50% of the total rows in the table.

PostgreSQL provides GIN and GiST types of index which are used for
full text searches, but off the top of my head I don't know if either
is actually equivalent to MySQL's implementation.  I suspect they're
quite different.  Hopefully someone more familiar with both systems'
full text search features can answer that.

-- 
Thom Brown
Registered Linux user: #516935

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql's table & index compared to that of MySQL

2010-08-16 Thread Tom Lane
Andy  writes:
> Are there any reasons why table & index sizes of Postgresql should be larger 
> than MySQL?

Well, the per-row headers in PG are definitely bigger because of MVCC
requirements.  It's hard to speculate much about index sizes with
no information about the table/index schemas.
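
A rough way to see the per-row header overhead from SQL (the exact number
is version-dependent, since a composite datum carries a full tuple header):

select pg_column_size(row());  -- ~24 bytes even for a completely empty row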

> The company in the presentation used Postgresql 8.1. Has there been any 
> significant changes in data size between 8.1 and 8.4/9.0?

Well, we shaved 4 bytes off the tuple header size since 8.1, and there's
been work on cutting per-field overhead too, and there's now some
ability to control fillfactor in indexes.  But there's really not enough
information here to say how much difference this might've made for them.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Postgresql's table & index compared to that of MySQL

2010-08-16 Thread Andy
For the same data set, with mostly text data, how does the data (table + index) 
size of Postgresql compare to that of MySQL?

In this presentation, the largest blog site in Japan talked about their 
migration from Postgresql to MySQL. One of their reasons for moving away from 
Postgresql was that data size in Postgresql was too large (p. 12 & p. 41). 
Specifically they talked about index being 40% of total data size: 

http://www.scribd.com/doc/2569473/Migrating-from-PostgreSQL-to-MySQL-at-Cocolog-Japans-Largest-Blog-Community

Are there any reasons why table & index sizes of Postgresql should be larger 
than MySQL? Postgresql uses MVCC while InnoDB does not use "full" MVCC, so 
perhaps that's a factor there.

Does anyone have any actual experience about how the data sizes of Postgresql & 
MySQL compare to each other?

The company in the presentation used Postgresql 8.1. Has there been any 
significant changes in data size between 8.1 and 8.4/9.0?

Thanks.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to refer to computed columns from other computed columns?

2010-08-16 Thread Alvaro Herrera
Excerpts from Boszormenyi Zoltan's message of Mon Aug 16 14:45:07 -0400 2010:
> Matthew Wilson írta:

> > I don't care if the code is rearranged so that c is replaced with an
> > inline definition during compilation.  I'm not concerned about
> > efficiency here.  I just don't want to have to redefine it manually over
> > and over again, because I know that as I update how c is defined, I'll
> > forget to update it everywhere.
> >
> > Maybe sql needs a preprocessing macro language like C.
> >   
> 
> Or maybe we can dust off my GENERATED column patch
> I posted here in 2006. :-)

Hmm, that seems entirely unrelated ...

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to refer to computed columns from other computed columns?

2010-08-16 Thread Thom Brown
2010/8/16 Boszormenyi Zoltan :
> Matthew Wilson írta:
>> On Mon Aug 16 10:26:36 2010, Tom Lane wrote:
>>
>>> Matthew Wilson  writes:
>>>
 All I can come up with so far is to use a view and then another view on
 top of that one:

>>> Note that you don't actually need a view, as you can just write the
>>> subselect in-line:
>>>
>>>      select a, b, c,
>>>      case when c < 0 then 'no'
>>>      else 'yes'
>>>      end as d
>>>      from (select a, b, a - b as c from foo) as v1;
>>>
>>> This is the standard method for avoiding repeat calculations in SQL.
>>>
>>> One thing to keep in mind is that the planner will usually try to
>>> "flatten" a nested sub-select (and whether it was written out manually
>>> or pulled from a view does not matter here).  This will result in the
>>> sub-select's expressions getting inlined into the parent, so that the
>>> calculations will actually get done more than once.  If you're trying
>>> to reduce execution time not just manual labor, you may want to put an
>>> "offset 0" into the sub-select to create an optimization fence.  But
>>> test whether that really saves anything --- if there are bigger joins
>>> or additional WHERE conditions involved, you can easily lose more than
>>> you gain by preventing flattening.
>>>
>>>                      regards, tom lane
>>>
>>>
>>
>> Thanks so much for the help!
>>
>> I don't care if the code is rearranged so that c is replaced with an
>> inline definition during compilation.  I'm not concerned about
>> efficiency here.  I just don't want to have to redefine it manually over
>> and over again, because I know that as I update how c is defined, I'll
>> forget to update it everywhere.
>>
>> Maybe sql needs a preprocessing macro language like C.
>>
>
> Or maybe we can dust off my GENERATED column patch
> I posted here in 2006. :-)
>
> Best regards,
> Zoltán Böszörményi
>

You mean this?:
http://archives.postgresql.org/pgsql-hackers/2006-07/msg00543.php
And this?: http://archives.postgresql.org/pgsql-hackers/2006-08/msg00038.php
And this?: http://archives.postgresql.org/pgsql-patches/2007-04/msg00107.php

-- 
Thom Brown
Registered Linux user: #516935

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to refer to computed columns from other computed columns?

2010-08-16 Thread Boszormenyi Zoltan
Matthew Wilson írta:
> On Mon Aug 16 10:26:36 2010, Tom Lane wrote:
>   
>> Matthew Wilson  writes:
>> 
>>> All I can come up with so far is to use a view and then another view on
>>> top of that one:
>>>   
>> Note that you don't actually need a view, as you can just write the
>> subselect in-line:
>>
>>  select a, b, c,
>>  case when c < 0 then 'no'
>>  else 'yes'
>>  end as d
>>  from (select a, b, a - b as c from foo) as v1;
>>
>> This is the standard method for avoiding repeat calculations in SQL.
>>
>> One thing to keep in mind is that the planner will usually try to
>> "flatten" a nested sub-select (and whether it was written out manually
>> or pulled from a view does not matter here).  This will result in the
>> sub-select's expressions getting inlined into the parent, so that the
>> calculations will actually get done more than once.  If you're trying
>> to reduce execution time not just manual labor, you may want to put an
>> "offset 0" into the sub-select to create an optimization fence.  But
>> test whether that really saves anything --- if there are bigger joins
>> or additional WHERE conditions involved, you can easily lose more than
>> you gain by preventing flattening.
>>
>>  regards, tom lane
>>
>> 
>
> Thanks so much for the help!
>
> I don't care if the code is rearranged so that c is replaced with an
> inline definition during compilation.  I'm not concerned about
> efficiency here.  I just don't want to have to redefine it manually over
> and over again, because I know that as I update how c is defined, I'll
> forget to update it everywhere.
>
> Maybe sql needs a preprocessing macro language like C.
>   

Or maybe we can dust off my GENERATED column patch
I posted here in 2006. :-)

Best regards,
Zoltán Böszörményi


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Encoding change question...

2010-08-16 Thread Karl Denninger
Peter C. Lai wrote:
> The doublequotes aren't UTF8; they come from people copying and pasting
> from Microsoft stuff, which is WIN-1252. So try using that with iconv
> instead of utf8.
>
> On 2010-08-16 12:40:03PM -0500, Karl Denninger wrote:
>   
>> So I have myself a nice pickle here.
>>
>> I've got a database which was originally created with SQL_ASCII for the
>> encoding (anything goes text fields)
>>
>> Unfortunately, I have a bunch of data that was encoded in UTF-8 that's
>> in an RSS feed that I need to load into said database.  iconv barfs all
>> over this file in an attempt to turn it into ISO-8859 (which is what the
>> web application currently attached to that database is emitting and
>> collecting.)  It appears the problem is (mostly) things like the
>> stylized double-quotes.
>>
>> So I figured I'd go the other way, and convert what I have now in the
>> tables into UTF8.
>>
>> Well, except that doesn't work either.
>>
>> ticker=# select convert_to(subject, 'utf8') from post where
>> ordinal='2098167';
>>                           convert_to
>> -----------------------------------------------------------------
>>  1%: the interest rate on IBM\222s most recent three-year bond.
>> (1 row)
>>
>> \222 is the correct code point for the styled single apostrophe that is
>> in that place in ISO-8859-1 in the source.  However, the UTF prefix is
>> missing, as are the other two code-point characters (that is, I got the
>> code point but not the other two bytes that should be in front of it. 
>> And if I set the code page on the web site to UTF-8, and also set the
>> encoding on the SQL session to UTF-8 I don't get the three-byte code - I
>> just get the one byte.  That's a bust.
>>
>> There are TWO fields in this database that need to be converted.  I presumed I
>> could do something like this:
>>
>> # update post set field1 = convert_to(field1, 'utf8');
>>
>> It runs to completion without complaint but produces the above.  No good.
>>
>> So is there a way to do this?  I do NOT want to dump, iconv on the
>> dumped file, then reload the database if I can possibly avoid it.  Yes,
>> I know that will work (I've tested it on my development box), but being
>> able to do this "hot" would be DRAMATICALLY preferred.  If I do the
>> iconv route I am basically rebuilding the entire database with all
>> that's involved in doing so in terms of downtime and such.  VERY
>> undesirable.
>>
>> (Equally-valid would be a way to convert UTF-8 to ISO-8859-1 (Latin1),
>> thereby allowing me to convert the incoming data stream to what's
>> already in the system, but thus far I've found no joy on that at all.)
>>
>> Ideas?
>>
>> -- Karl
>> 
No, the problem is that the existing data in the database and the web
app that are using it are both ISO-8859-1.

The issue is that the data I need to load INTO the system is in UTF-8
(and really is), and is full of three-byte escapes.  iconv barfs all
over it trying to go to ISO-8859-1, so I can't convert the INCOMING data
to what's in the system now.  It also won't convert it to Windows-1252
(which is kind of a superset of ISO-8859, and thus the server might not
toss on it too badly.)

FS/karl:~/tmp> iconv -f UTF-8 -t WINDOWS-1252 <file >/dev/null
iconv: (stdin):2766:6125: cannot convert

FS/karl:~/tmp> iconv -f UTF-8 -t ISO-8859-1 <file >/dev/null
iconv: (stdin):222:391: cannot convert

The data set that is incoming is some ~30,000 records.  The data in the
system already is ~2m records.  Obviously, if I can convert the incoming
data that's the better choice, but it appears I can't.

Thus the desire to go the other way - turn the existing data in the
database into UTF-8, which is probably desirable in the future anyway.
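
If the bytes now in the tables really are WIN1252/ISO-8859-1, something
along these lines might do that conversion in place (an untested sketch; it
leans on conversions to and from SQL_ASCII being byte-for-byte no-ops in a
SQL_ASCII database):

update post
   set subject = convert_from(
         convert(convert_to(subject, 'SQL_ASCII'), 'WIN1252', 'UTF8'),
         'SQL_ASCII');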

I can iconv a pg_dumpall of the database TO UTF-8 and it succeeds (at
least in theory), but the database into which I want to load this data
set is HUGE and dumping and reloading it isn't on my "A" list of things
to do.

If I have to I have to - but if I can avoid this I certainly WANT TO.

What I don't understand is why the Postgres function "convert_to()"
doesn't emit the THREE BYTE sequence, and only emits the codepoint.


-- Karl
-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to refer to computed columns from other computed columns?

2010-08-16 Thread Matthew Wilson
On Mon Aug 16 10:26:36 2010, Tom Lane wrote:
> Matthew Wilson  writes:
>> All I can come up with so far is to use a view and then another view on
>> top of that one:
>
> Note that you don't actually need a view, as you can just write the
> subselect in-line:
>
>  select a, b, c,
>  case when c < 0 then 'no'
>  else 'yes'
>  end as d
>  from (select a, b, a - b as c from foo) as v1;
>
> This is the standard method for avoiding repeat calculations in SQL.
>
> One thing to keep in mind is that the planner will usually try to
> "flatten" a nested sub-select (and whether it was written out manually
> or pulled from a view does not matter here).  This will result in the
> sub-select's expressions getting inlined into the parent, so that the
> calculations will actually get done more than once.  If you're trying
> to reduce execution time not just manual labor, you may want to put an
> "offset 0" into the sub-select to create an optimization fence.  But
> test whether that really saves anything --- if there are bigger joins
> or additional WHERE conditions involved, you can easily lose more than
> you gain by preventing flattening.
>
>   regards, tom lane
>

Thanks so much for the help!

I don't care if the code is rearranged so that c is replaced with an
inline definition during compilation.  I'm not concerned about
efficiency here.  I just don't want to have to redefine it manually over
and over again, because I know that as I update how c is defined, I'll
forget to update it everywhere.

Maybe sql needs a preprocessing macro language like C.



Matt


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Encoding change question...

2010-08-16 Thread Peter C. Lai
The doublequotes aren't UTF8; they come from people copying and pasting
from Microsoft stuff, which is WIN-1252. So try using that with iconv
instead of utf8.

On 2010-08-16 12:40:03PM -0500, Karl Denninger wrote:
> So I have myself a nice pickle here.
> 
> I've got a database which was originally created with SQL_ASCII for the
> encoding (anything goes text fields)
> 
> Unfortunately, I have a bunch of data that was encoded in UTF-8 that's
> in an RSS feed that I need to load into said database.  iconv barfs all
> over this file in an attempt to turn it into ISO-8859 (which is what the
> web application currently attached to that database is emitting and
> collecting.)  It appears the problem is (mostly) things like the
> stylized double-quotes.
> 
> So I figured I'd go the other way, and convert what I have now in the
> tables into UTF8.
> 
> Well, except that doesn't work either.
> 
> ticker=# select convert_to(subject, 'utf8') from post where
> ordinal='2098167';
>                           convert_to
> -----------------------------------------------------------------
>  1%: the interest rate on IBM\222s most recent three-year bond.
> (1 row)
> 
> \222 is the correct code point for the styled single apostrophe that is
> in that place in ISO-8859-1 in the source.  However, the UTF prefix is
> missing, as are the other two code-point characters (that is, I got the
> code point but not the other two bytes that should be in front of it.)
> And if I set the code page on the web site to UTF-8, and also set the
> encoding on the SQL session to UTF-8 I don't get the three-byte code - I
> just get the one byte.  That's a bust.
> 
> There are TWO fields in this database that need to be converted.  I presumed I
> could do something like this:
> 
> # update post set field1 = convert_to(field1, 'utf8');
> 
> It runs to completion without complaint but produces the above.  No good.
> 
> So is there a way to do this?  I do NOT want to dump, iconv on the
> dumped file, then reload the database if I can possibly avoid it.  Yes,
> I know that will work (I've tested it on my development box), but being
> able to do this "hot" would be DRAMATICALLY preferred.  If I do the
> iconv route I am basically rebuilding the entire database with all
> that's involved in doing so in terms of downtime and such.  VERY
> undesirable.
> 
> (Equally-valid would be a way to convert UTF-8 to ISO-8859-1 (Latin1),
> thereby allowing me to convert the incoming data stream to what's
> already in the system, but thus far I've found no joy on that at all.)
> 
> Ideas?
> 
> -- Karl
> 

> 

> 
> -- 
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


-- 
===
Peter C. Lai | Bard College at Simon's Rock
Systems Administrator| 84 Alford Rd.
Information Technology Svcs. | Gt. Barrington, MA 01230 USA
peter AT simons-rock.edu | (413) 528-7428
===


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Encoding change question...

2010-08-16 Thread Karl Denninger
So I have myself a nice pickle here.

I've got a database which was originally created with SQL_ASCII for the
encoding (anything goes text fields)

Unfortunately, I have a bunch of data that was encoded in UTF-8 that's
in an RSS feed that I need to load into said database.  iconv barfs all
over this file in an attempt to turn it into ISO-8859 (which is what the
web application currently attached to that database is emitting and
collecting.)  It appears the problem is (mostly) things like the
stylized double-quotes.

So I figured I'd go the other way, and convert what I have now in the
tables into UTF8.

Well, except that doesn't work either.

ticker=# select convert_to(subject, 'utf8') from post where
ordinal='2098167';
                          convert_to
-----------------------------------------------------------------
 1%: the interest rate on IBM\222s most recent three-year bond.
(1 row)

\222 is the correct code point for the styled single apostrophe that is
in that place in ISO-8859-1 in the source.  However, the UTF prefix is
missing, as are the other two code-point characters (that is, I got the
code point but not the other two bytes that should be in front of it.)
And if I set the code page on the web site to UTF-8, and also set the
encoding on the SQL session to UTF-8 I don't get the three-byte code - I
just get the one byte.  That's a bust.

There are TWO fields in this database that need to be converted.  I presumed I
could do something like this:

# update post set field1 = convert_to(field1, 'utf8');

It runs to completion without complaint but produces the above.  No good.

So is there a way to do this?  I do NOT want to dump, iconv on the
dumped file, then reload the database if I can possibly avoid it.  Yes,
I know that will work (I've tested it on my development box), but being
able to do this "hot" would be DRAMATICALLY preferred.  If I do the
iconv route I am basically rebuilding the entire database with all
that's involved in doing so in terms of downtime and such.  VERY
undesirable.

(Equally-valid would be a way to convert UTF-8 to ISO-8859-1 (Latin1),
thereby allowing me to convert the incoming data stream to what's
already in the system, but thus far I've found no joy on that at all.)

Ideas?

-- Karl

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] good exception handling archiecutre

2010-08-16 Thread Alban Hertroys
On 16 Aug 2010, at 10:45,  
 wrote:

> Hi Alban,
> 
> Thanks for the reply.
> 
> I checked that NOTIFY will not work till the transaction is committed. So in
> our case, since the transaction is being rolled back, there will be no notify.

Ah, you're right! I totally forgot about that...

> Is there any other way of inter-process communication in Postgres?

Of course there is :)

There's nothing stopping you from writing a few custom functions in an 
untrusted language that do your IPC for you. C comes to mind, but PL/PerlU, 
PL/PythonU, etc. should be able to do it as well (and probably more easily).

Usually if you're talking about performing I/O external to the database you'll 
get warned that transactions involving those calls can't be rolled back, but in 
this case that's exactly what you want.
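
For example, here's a minimal sketch of a logging function whose effect
survives a rollback, assuming plperlu is installed (the log path is
hypothetical):

create or replace function log_to_file(msg text) returns void as $$
  # plain file I/O is outside transaction control, so it can't be rolled back
  open(my $fh, '>>', '/tmp/pg_app_errors.log') or die $!;
  print $fh "$_[0]\n";
  close($fh);
$$ language plperlu;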

Now I know that Postgres uses some IPC internally as well (at least shared 
memory), so maybe it's possible to use some of the internal mechanisms. I have 
no idea whether that's possible or at all advisable, I'm sure someone (probably 
Tom) will chime in regarding that.

Cheers!

> -Original Message-
> From: Alban Hertroys [mailto:dal...@solfertje.student.utwente.nl]
> Sent: 13 August 2010 17:50
> To: Atul Goel
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] good exception handling archiecutre
> 
> On 13 Aug 2010, at 14:07,  
>  wrote:
> 
>> How can we do error logging in Postgres. I am trying to create a 
>> LOG_AND_STOP method which would be generic to the code. I have pasted the 
>> code I have written. This code returns no data which is understandable as 
>> the error is thrown to the external world.
>> 
>> Can I write the code somehow.
>> 
>> CREATE SCHEMA test
>>  AUTHORIZATION postgres;
>> 
>> CREATE TABLE test.error_log
>> (
>>  error_desc text
>> );
>> 
>> 
>> CREATE OR REPLACE FUNCTION test.log_and_stop(err_desc text)
>>  RETURNS void AS
>> $BODY$
>> Declare
>> x integer;
>> begin
>> begin
>> insert into test.error_log values (err_desc);
>> end;
>> begin
>> raise exception '%',err_desc;
> 
> Your transaction got rolled back here, hence the lack of data in your 
> error_log table.
> 
> I don't think there's any way to log errors into a table from within the same 
> transaction, you'll probably need to use a separate connection for the 
> logging. I think LISTEN/NOTIFY may come in handy there.
> 
>> end;
>> end;
>> $BODY$
>>  LANGUAGE 'plpgsql' VOLATILE
>>  COST 100;
>> ALTER FUNCTION test.log_and_stop(text) OWNER TO postgres;
> 
> Alban Hertroys
> 
> --
> If you can't see the forest for the trees,
> cut the trees and you'll see there is no forest.
> 
> 
> 
> 
> 
> This e-mail is confidential and should not be used by anyone who is not the 
> original intended recipient. Global DataPoint Limited does not accept 
> liability for any statements made which are clearly the sender's own and not 
> expressly made on behalf of Global DataPoint Limited. No contracts may be 
> concluded on behalf of Global DataPoint Limited by means of e-mail 
> communication. Global DataPoint Limited Registered in England and Wales with 
> registered number 3739752 Registered Office Middlesex House, 34-42 Cleveland 
> Street, London W1T 4LB
> 
> -- 
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
> 

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Wrong "ORDER BY" on a numeric value result

2010-08-16 Thread Stefan Wild
The answer is quite simple: this statement was composed by several functions 
used for ordering on string values as well as on numeric ones.
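
A minimal illustration of the effect, with the cast made explicit (this is
what applying upper() to a numeric column effectively does):

select n from (values (2), (9), (10)) as t(n) order by upper(n::text); -- 10, 2, 9
select n from (values (2), (9), (10)) as t(n) order by n;              -- 2, 9, 10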


--- Thomas Kellerer  wrote on Sun, 15.8.2010:

> From: Thomas Kellerer 
> Subject: Re: [GENERAL] Wrong "ORDER BY" on a numeric value result
> To: pgsql-general@postgresql.org
> Date: Sunday, 15 August 2010, 10:43
> Stefan Wild wrote on 15.08.2010
> 10:36:
> >> column is numeric, but upper() works on text, and
> returns
> >> text, so your
> >> numeric column got cast to text by using upper
> (which is
> >> pointless
> >> anyway - there is no "upper" version of digits).
> >> remove upper() and you'll be fine.
> >>
> 
> > Thank you guys! That was the point.
> 
> The real question is:
> 
> what did you try to accomplish with the UPPER() on a
> numeric column?
> 
> Regards
> Thomas
> 
> 
> -- 
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
> 




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Using record.items() method

2010-08-16 Thread Luís de Sousa
Hello everyone,

I'm trying to code a function that processes records from various tables
performing the same action in each case. The function receives as an
argument a RECORD-type variable and I'd like to iterate through each of its
elements. According to the documentation here:

http://python.projects.postgresql.org/pldocs/plpython3-postgres-types-record.html

there exists an items() method that returns a set of tuples each with the
attribute name and the associated value. How can I use this method with
pl/pgsql?

Thank you,

Luís


Re: [GENERAL] How to refer to computed columns from other computed columns?

2010-08-16 Thread Eric Ndengang

On 16.08.2010 14:45, Matthew Wilson wrote:

I'm converting some procedural code to SQL as an experiment.  Here's the
pseudocode:

 c = a - b
 if c < 0 then d = 'no'
 else d = 'yes'

In SQL, I've got this:

 select a, b, a - b as c,
 case when a - b < 0 then 'no'
 else 'yes'
 end as d

 from foo;

This is a trivial example, but you can see how I calculate a - b two
separate times.

In reality, I have much nastier calculations and they happen more than
just twice.

I'm looking for an elegant solution for this puzzle.  I don't want to
repeat that a - b part over and over because I likely will need to
change how c gets defined and I don't want to have to change more than
one place in the code.

All I can come up with so far is to use a view and then another view on
top of that one:

 create view v1 as
 select a, b, a - b as c
 from foo;

 create view v2 as
 select a, b, c,
 case when c < 0 then 'no'
 else 'yes'
 end as d
 from v1;

This is better than the first solution because c is only defined in a
single place.  Is this the best possible solution?

Thanks for the help.

Matt


   
You can also use the 'WITH queries' (common table expression) feature to
solve this problem, like this:

with table_1 as (select a, b, a - b as c from foo)
select a, b, c,
       case when c < 0 then 'no'
            else 'yes' end as d
from table_1;

I hope it will help you.

--
Eric Ndengang
Junior Datenbankentwickler

Affinitas GmbH  |  Kohlfurter Straße 41/43  |  10999 Berlin  |  Germany
email: eric.ndengang_fo...@affinitas.de  | tel: +49.(0)30. 991 949 5 0  |  
www.edarling.de

Geschäftsführer: Lukas Brosseder, David Khalil, Kai Rieke, Christian Vollmann
Eingetragen beim Amtsgericht Berlin, HRB 115958


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to refer to computed columns from other computed columns?

2010-08-16 Thread Tom Lane
Matthew Wilson  writes:
> All I can come up with so far is to use a view and then another view on
> top of that one:

Note that you don't actually need a view, as you can just write the
subselect in-line:

 select a, b, c,
 case when c < 0 then 'no'
 else 'yes'
 end as d
 from (select a, b, a - b as c from foo) as v1;

This is the standard method for avoiding repeat calculations in SQL.

One thing to keep in mind is that the planner will usually try to
"flatten" a nested sub-select (and whether it was written out manually
or pulled from a view does not matter here).  This will result in the
sub-select's expressions getting inlined into the parent, so that the
calculations will actually get done more than once.  If you're trying
to reduce execution time not just manual labor, you may want to put an
"offset 0" into the sub-select to create an optimization fence.  But
test whether that really saves anything --- if there are bigger joins
or additional WHERE conditions involved, you can easily lose more than
you gain by preventing flattening.
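
For example, the fenced form of the query above is just the same query
with the fence added:

	select a, b, c,
	case when c < 0 then 'no'
	else 'yes'
	end as d
	from (select a, b, a - b as c from foo offset 0) as v1;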

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] How to refer to computed columns from other computed columns?

2010-08-16 Thread Matthew Wilson
I'm converting some procedural code to SQL as an experiment.  Here's the
pseudocode:

c = a - b
if c < 0 then d = 'no'
else d = 'yes'

In SQL, I've got this:

select a, b, a - b as c,
case when a - b < 0 then 'no'
else 'yes'
end as d

from foo;

This is a trivial example, but you can see how I calculate a - b two
separate times.

In reality, I have much nastier calculations and they happen more than
just twice.

I'm looking for an elegant solution for this puzzle.  I don't want to
repeat that a - b part over and over because I likely will need to
change how c gets defined and I don't want to have to change more than
one place in the code.

All I can come up with so far is to use a view and then another view on
top of that one:

create view v1 as
select a, b, a - b as c
from foo;

create view v2 as
select a, b, c,
case when c < 0 then 'no'
else 'yes'
end as d
from v1;

This is better than the first solution because c is only defined in a
single place.  Is this the best possible solution?

Thanks for the help.

Matt


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] good exception handling archiecutre

2010-08-16 Thread Adrian von Bidder
Heyho!

[[ logging from postgres ]]

I think you can use dblink to do this - everything you do via dblink happens 
in a separate db session and so it shouldn't be affected by what happens in 
your transaction.

Depending on your requirements, system resources may be an issue though 
(you're opening a full additional db session.)
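
A minimal sketch of that, assuming the dblink module is installed (the
connection string and table are placeholders):

select dblink_exec('dbname=mydb',
                   'insert into test.error_log values (''something failed'')');
-- runs in its own session and transaction, so it survives the caller's rollback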

cheers
-- vbi


-- 
Although this syntax is required by the SQL standard, it's not clear
what the standard expects it to do, except that it's most likely not
what we were doing.
-- PostgreSQL 8.3 release notes




Re: [GENERAL] InitDB: Bad system call

2010-08-16 Thread Torsten Zühlsdorff

Hello,


... The simplest explanation
I can think of is that it's *only* shmctl that is malfunctioning, not
the other SysV shared memory calls.  Which is even weirder, and
definitely seems to move the problem into the category of kernel bug
rather than configuration mistake.


Hmmm ... Google turned up the information that FreeBSD migrated from int
to size_t variables for shared memory size between 7.0 and 8.0, and in
particular that the size of the struct used by shmctl() changed in
8.0.  So I'm now wondering if what you're dealing with is some sort of
version skew problem.  Could it be that you built Postgres against
system header files that don't match your kernel version?  I'm not
exactly sure how that would manifest as this particular signal,
but it seems worth checking.


I have the correct header files, but that brings me to an interesting 
observation and a workaround.


Before I built the new jail, I checked out the newest sources for 
FreeBSD 7.0 and recompiled the world. With the new "world" I built the 
jail, and the problems occurred.
Meanwhile there are two jails with PostgreSQL running on the same 
server. And IPC problems seem unlikely to me, because the error 
messages normally look very different, and other instances run 
without problems. ;)


What I've done now was to disable an old jail and copy it to a new 
location. After some reconfiguration I used the copy as the new jail 
and installed PostgreSQL. And it works.


That fortifies your assumption that the problem must lie in FreeBSD. But 
this will be hard to debug, because the machine's last "make world" was 
3 years ago. I will describe the problem to the FreeBSD community.


Thanks for all your help and time,
Torsten

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] good exception handling archiecutre

2010-08-16 Thread Atul.Goel
Hi Alban,

Thanks for the reply.

I checked that NOTIFY will not work till the transaction is committed. So in our 
case, since the transaction is being rolled back, there will be no notify. I am 
new to Postgres so I am not sure how to run LISTEN as a daemon so that it keeps 
running to be notified.

Here is the extract form 
http://www.postgresql.org/docs/8.4/static/sql-notify.html
"NOTIFY interacts with SQL transactions in some important ways. Firstly, if a 
NOTIFY is executed inside a transaction, the notify events are not delivered 
until and unless the transaction is committed. This is appropriate, since if 
the transaction is aborted, all the commands within it have had no effect, 
including NOTIFY."

Is there any other way of inter-process communication in Postgres?

Regards,
Atul Goel

-Original Message-
From: Alban Hertroys [mailto:dal...@solfertje.student.utwente.nl]
Sent: 13 August 2010 17:50
To: Atul Goel
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] good exception handling archiecutre

On 13 Aug 2010, at 14:07,  
 wrote:

> How can we do error logging in Postgres. I am trying to create a LOG_AND_STOP 
> method which would be generic to the code. I have pasted the code I have 
> written. This code returns no data which is understandable as the error is 
> thrown to the external world.
>
> Can I write the code somehow.
>
> CREATE SCHEMA test
>   AUTHORIZATION postgres;
>
> CREATE TABLE test.error_log
> (
>   error_desc text
> );
>
>
> CREATE OR REPLACE FUNCTION test.log_and_stop(err_desc text)
>   RETURNS void AS
> $BODY$
> Declare
> x integer;
> begin
> begin
> insert into test.error_log values (err_desc);
> end;
> begin
> raise exception '%',err_desc;

Your transaction got rolled back here, hence the lack of data in your error_log 
table.

I don't think there's any way to log errors into a table from within the same 
transaction, you'll probably need to use a separate connection for the logging. 
I think LISTEN/NOTIFY may come in handy there.

> end;
> end;
> $BODY$
>   LANGUAGE 'plpgsql' VOLATILE
>   COST 100;
> ALTER FUNCTION test.log_and_stop(text) OWNER TO postgres;

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.




This e-mail is confidential and should not be used by anyone who is not the 
original intended recipient. Global DataPoint Limited does not accept liability 
for any statements made which are clearly the sender's own and not expressly 
made on behalf of Global DataPoint Limited. No contracts may be concluded on 
behalf of Global DataPoint Limited by means of e-mail communication. Global 
DataPoint Limited Registered in England and Wales with registered number 
3739752 Registered Office Middlesex House, 34-42 Cleveland Street, London W1T 
4LB

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Relationship between ulimit and effective_cache_size, work_mem, shared_buffers, maintenance_work_mem and vacuum

2010-08-16 Thread Greg Smith

Sandeep Srinivasa wrote:


I was looking at ways to optimize the postgres database and yet limit 
the amount of memory that it can consume.


You can't usefully limit the amount of memory that PostgreSQL will 
consume yet.  Each client can easily allocate multiples of work_mem 
worth of memory as they run queries, and there's temp_buffers to be 
concerned about too.  One way to work around problems like this is to 
significantly limit the number of clients that can be running at a time, 
using something like a connection pooler, and then keep a low setting 
for max_connections.  This can have some positive performance benefits 
of its own, and it will keep the number of clients (the only real source 
of variable allocations on the server) from getting too large.  Software 
suitable for that purpose includes pgbouncer and pgpool.
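
For reference, you can list the settings that drive this variable memory
use straight from pg_settings:

select name, setting, unit
  from pg_settings
 where name in ('max_connections', 'work_mem', 'temp_buffers',
                'shared_buffers', 'maintenance_work_mem');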


Now, the "effective cache size" variable seems more of a hint to the 
query planner, than any hard limit on the database server.


That's correct.  It doesn't allocate anything.  Doesn't limit anything 
either.


Q1.  if I add "ulimit -m" and "ulimit -v" lines in my postgres upstart 
files  will that be good enough to hard-limit Postgres memory usage ?


After fighting a few random crashes where the server runs into ulimit, 
you will find that trying to hard limit PostgreSQL memory usage is more 
trouble than it's worth.  It's really a bad place to go.


Q2.  once I have decided my max memory allocation (call it MY_ULIMIT) 
- should effective cache size be set to MY_ULIMIT - 256 - 12 -20 ? 
 round it off to MY_ULIMIT - 512mb maybe


effective_cache_size has no relation to the limits.  Take a look at how 
much of your operating system cache you think might be free at any 
time.  Figure out what percentage of that you might want PostgreSQL to 
be able to use sometime.  Set effective_cache_size to it.  If a query 
goes wild and decides to execute a really bad query plan that reads a 
bunch of data, it will trash the operating system cache regardless; you 
can't stop it like this.
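
For example (only a sketch -- the right value depends entirely on your RAM
and workload; since this is just a planner hint it can even be set per
session):

set effective_cache_size = '4GB';
show effective_cache_size;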


Q3. Or will doing something like this play havoc with the query 
planner/unexplained OOM/crashes ?


If you ever let the system get so low on RAM that the Linux OOM killer 
becomes active, it will almost always kill the main database server 
process, the one that spawns all the clients off, due to how Linux does 
shared memory accounting.  This is another really bad thing to be avoided.


1. will this affect the memory usage of vacuum (going to be using 
default vacuum settings for 8.4) - because ideally I would want to 
have some control over it as well.


Each of the autovacuum processes (defaulting to 3) will use up to 
maintenance_work_mem worth of memory when they are running.  You should 
account for that when estimating peak usage.
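
For example, at the 8.4 defaults (maintenance_work_mem = 16MB,
autovacuum_max_workers = 3), that's up to roughly 3 x 16MB = 48MB of
additional peak usage.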


2. Would I have to tune max_connections, max_files_per_process (and 
any related variables) ?


Limiting max_connections, and accordingly dealing with the fact that 
some connections might be refused temporarily in your application, is 
the most effective thing you can do here.  max_files_per_process is 
really secondary to any of the other bits you're asking about.


3. When I turn on WAL, would I have to tune wal_buffers accordingly and 
 set effective cache size to account for wal_buffers as well ?


Set wal_buffers somewhere between 1MB and 16MB, include it in the 
general server shared memory overhead, and then ignore it.  It takes up 
a little memory but isn't nearly as important as these other bits.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general