Re: [GENERAL] Libpq memory leak

2010-09-24 Thread Bret S. Lambert
On Fri, Sep 24, 2010 at 06:11:31PM +0400, Dmitriy Igrishin wrote:
 Hey Vladimir,
 
 
 
 2010/9/24 Polyakov Vladimir vvpolya...@gmail.com
 
  A program written in C using libpq, which receives large files (BYTEA),
  has a memory leak.
  I need to free ALL of the used memory after each SQL query.
 
  After each call to PQclear() I shrink the buffer:
     conn->inBuffer = realloc(conn->inBuffer, 8192);
     conn->inBufSize = 8192;


This is a known-unsafe use of the realloc() function. If it fails to
allocate memory, you've just lost the only pointer to conn->inBuffer,
thus leaking memory in your own code. Fix this first, and then see if
you still have the issue with memory leaks, because it's possible
you're just leaking it with a bad realloc() idiom.
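
A minimal sketch of the safe idiom (note that conn->inBuffer and
conn->inBufSize are libpq internals taken from the original post, not
part of the public API):

    /* Assign realloc()'s result to a temporary so the original
     * pointer survives an allocation failure. */
    char *tmp = realloc(conn->inBuffer, 8192);
    if (tmp != NULL) {
        conn->inBuffer = tmp;
        conn->inBufSize = 8192;
    }
    /* else: conn->inBuffer is still valid at its old size */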

 
  It works, but ..
  I noticed that in some cases PQclear() does not clear the memory.
  This happens only when the program receives certain files...
 
  Why do you need realloc() after PQclear()?
 
 -- 
 Regards,
 Dmitriy

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] What Linux edition we should chose?

2010-06-01 Thread Bret S. Lambert
On Wed, Jun 02, 2010 at 01:32:44AM -0400, Greg Smith wrote:
 Nilesh Govindarajan wrote:
 I run my site (see my signature) on a self-managed VPS. I was using
 the default PostgreSQL RPM from the Fedora repository, and the site was
 getting very slow. So I compiled Apache, PHP, and PostgreSQL with
 custom gcc flags, which improved performance dramatically.

And were the versions the same? If you're going to go to the
trouble of hand-compiling, I'm willing to bet that you went to
the trouble of finding more recent versions of the software.

That is not how you test things.

 
 Without breaking down how much of that speed increase was from
 Apache, PHP, and PostgreSQL respectively, I'm not sure what the
 people who package PostgreSQL can really learn from your data here.
 Reports on improving PostgreSQL performance by tweaking optimizer
 flags haven't been very repeatable for others when they've popped up
 in the past, so for all we know the bulk of your gain came from
 Apache and PHP optimizations.

Not to mention that compiler optimizations increase the chance of
hitting a compiler bug. Getting the wrong answer fast is not an
improvement over getting the right answer slowly.

 
 -- 
 Greg Smith  2ndQuadrant US  Baltimore, MD
 PostgreSQL Training, Services and Support
 g...@2ndquadrant.com   www.2ndQuadrant.us
 
 

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Row ordering after CREATE TABLE AS...SELECT regexp_split_to_table(source_text, regexp) AS new_column

2010-02-23 Thread Bret S. Lambert
On Wed, Feb 24, 2010 at 07:51:54AM +0100, John Gage wrote:
 This is a two-part question:
 
 1) I have a source_text that I want to divide into smaller subunits
 that will be contained in rows in a column of a new table.  Is it
 absolutely certain that the initial order of the rows in the
 resultant table after this operation:
 
 CREATE TABLE new_table AS SELECT regexp_split_to_table(source_text,
 E'regexp') AS subunits FROM source_table;
 
 will be the same as the order of these subunits in the original
 text?  Emphasis *initial order*.

I'd put money on not; a table has no inherent row order, and
preserving one is not what databases are designed for.

 
 2) I would like to be able to create a serial-type column during
 CREATE TABLE AS in the new table that memorizes this order so that
 I can reconstruct the original text using ORDER BY on that serial
 column.  However, I am stumped how to do that.  I do not see how to
 put the name of that column into my SELECT statement which generates
 the table, and I do not see where else to put it.  Please forgive my
 stupidity.

Pre- or append an increasing serial number to the data, use it as a
column named initial_order (or something else that will make it clear
to you and other users what it is), and then import; see the sketch
below.
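
A minimal sketch of that in SQL (names are made up; note that a serial
column is filled in the order rows are produced by the INSERT, which
works in practice here but is an implementation detail rather than a
standards-level guarantee):

    CREATE TABLE new_table (
        initial_order serial,  -- records arrival order
        subunit       text
    );

    INSERT INTO new_table (subunit)
        SELECT regexp_split_to_table(source_text, E'regexp')
        FROM source_table;

    -- Reconstruct the original text later:
    SELECT subunit FROM new_table ORDER BY initial_order;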

But if you have the original data, in order, why do you need to be
able to reconstruct it from a database dump? It just looks like adding
a step for the sake of adding a step, to me.

 
 The workaround for this problem has been to ALTER my table after
 its creation, adding a new serial-type column.  But this assumes that
 the answer to question 1) above is always yes.
 
 Thanking you for your understanding,
 
 John
 
 

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Multiple buffer cache?

2010-02-06 Thread Bret S. Lambert
On Fri, Feb 05, 2010 at 11:41:13PM +0200, Alexei Vladishev wrote:
 Greetings,
 
 Is there a way of configuring PostgreSQL so that one specific table
 would use, say, 4GB of buffer cache while other tables use the rest?
 
 I would like to keep the table and its indexes always in a hot
 state, so that other queries won't pollute this part of the buffer
 cache. It would ensure reliable performance and far fewer disk IOPS
 when working with the table.

Fiddling with the buffer cache like that would require some sort of
OS support, if I'm not mistaken in what you're asking for.

And then, even if the support is there, you'd need to outline exactly
how you're planning on pushing this button.

Specifically, what's your usage pattern that would make this a
win for you?

If the table and its indexes can already fit into the buffer cache,
and it's as commonly accessed as you think it is, the OS should
probably have it cached anyway.

 
 Is it possible?
 
 Thanks for any hints!
 
 Alexei
 

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Multiple buffer cache?

2010-02-06 Thread Bret S. Lambert
On Sat, Feb 06, 2010 at 03:46:58PM +0200, Alexei Vladishev wrote:
 Bret,
 
 Thank you for your response!
 
 Greetings,
 
  Is there a way of configuring PostgreSQL so that one specific table
  would use, say, 4GB of buffer cache while other tables use the rest?
  
  I would like to keep the table and its indexes always in a hot
  state, so that other queries won't pollute this part of the buffer
  cache. It would ensure reliable performance and far fewer disk IOPS
  when working with the table.
 
 Fiddling with the buffer cache like that would require some sort of
 OS support, if I'm not mistaken in what you're asking for.
 I am talking about the PostgreSQL buffer cache, not the OS level.
 I believe it has nothing to do with OS support.

Well, kinda; but I'd been spending too much time doing admin work, and
completely spaced on the Postgres meaning when you said buffer cache,
so sorry for the mixup.

 
 It would be great to have support for multiple buffer caches assigned
 to different sets of tables. Having this implemented, I would assign
 frequently accessed configuration tables (selects and updates) to one
 buffer and historical tables (lots of insert operations) to another,
 so the two sets would use independent buffers and wouldn't affect
 each other.

Fair enough.

 
 And then, even if the support is there, you'd need to outline exactly
 how you're planning on pushing this button.
 
 Specifically, what's your usage pattern that would make this a
 win for you?
 Let me explain. I have a very busy application generating thousands
 of SQL statements per second. There is an application-level cache
 built into the application already.
 
 The important part is that once per hour the application writes
 collected data to huge historical tables (100M up to billions of
 records, partitioned). Since this happens only once an hour, the
 database buffer cache has already been overwritten by the data and
 indexes of other tables, so the write operation is very slow and
 requires a huge number of disk seeks, causing a 50-100x drop in
 performance.

The disk seeks will happen regardless of what Postgres does, as the
OS pulls in new disk blocks to perform the write. Unless your OS's
buffer cache is large enough to hold all the data you need, your
best bet is likely partitioning data across multiple disks, so that
queuing the archive reads doesn't get in the way of production reads.

As I'm a unix admin mostly, I'm not qualified to give advice on whether
or not that's possible, or how to do it if it is ;)
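
For what it's worth, tablespaces are the usual PostgreSQL mechanism
for spreading tables across disks; a minimal sketch, with hypothetical
paths and names:

    -- Put the historical data on its own spindle(s) so its I/O
    -- doesn't compete with the production tables.
    CREATE TABLESPACE archive LOCATION '/mnt/disk2/pgdata';

    CREATE TABLE history (
        recorded_at timestamptz NOT NULL,
        payload     text
    ) TABLESPACE archive;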

 
 So, my idea is to assign a separate buffer cache to the historical
 tables. It would guarantee that the index data is always cached, so
 the write operation would be very fast.
 
 Is it possible? Are there any other techniques available?

If it were at all possible, I'd actually set up a secondary archiving
server (unless you need the historical data on tap for the production
system as well), either on another port on the same machine, or on
another machine which won't impact your production system if it has to
suddenly do a bunch of disk I/O, and log the history to that.

 
 If the table and its indexes can already fit into the buffer cache,
 and it's as commonly accessed as you think it is, the OS should
 probably have it cached anyway.
 I see what you are saying but the problem is that it is normally
 accessed once per hour only.
 
 Any thoughts?
 
 Alexei

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Multiple buffer cache?

2010-02-06 Thread Bret S. Lambert
On Sat, Feb 06, 2010 at 02:44:32PM +0100, Cédric Villemain wrote:
 2010/2/6 Bret S. Lambert bret.lamb...@gmail.com:
  On Fri, Feb 05, 2010 at 11:41:13PM +0200, Alexei Vladishev wrote:
  Greetings,
 
   Is there a way of configuring PostgreSQL so that one specific table
   would use, say, 4GB of buffer cache while other tables use the rest?
  
   I would like to keep the table and its indexes always in a hot
   state, so that other queries won't pollute this part of the buffer
   cache. It would ensure reliable performance and far fewer disk IOPS
   when working with the table.
 
  Fiddling with the buffer cache like that would require some sort of
  OS support, if I'm not mistaken in what you're asking for.
 
  And then, even if the support is there, you'd need to outline exactly
  how you're planning on pushing this button.
 
  Specifically, what's your usage pattern that would make this a
  win for you?
 
  If the table and its indexes can already fit into the buffer cache,
  and it's as commonly accessed as you think it is, the OS should
  probably have it cached anyway.
 
 that's all true.
 
 I am working on pgfincore, which allows you (in some ways) to prepare
 the buffer cache. You need pg > 8.4 and Linux (it probably works with
 BSD too).

Why do something with a non-portable interface? Most OSes support
coherently mmap(2)'ing disk blocks into memory; in fact, I'm somewhat
taken aback that the postgres buffer cache isn't implemented in that
manner, but I'm willing to give the devs credit for having not done
so for good reasons.
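
For reference, a minimal sketch of the interface in question: mapping
a file into memory so the OS page cache backs it (a generic POSIX
example, nothing Postgres-specific):

    #include <fcntl.h>
    #include <stddef.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Map a whole file read-only; returns NULL on error. */
    static void *map_file(const char *path, size_t *len)
    {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return NULL;
        struct stat st;
        if (fstat(fd, &st) < 0) {
            close(fd);
            return NULL;
        }
        void *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        close(fd);  /* the mapping stays valid after close() */
        if (p == MAP_FAILED)
            return NULL;
        *len = (size_t)st.st_size;
        return p;
    }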

 
 I don't consider it ready for production, but it's fine for debugging
 things. If you really care about buffer-cache preloading, tell me and
 I'll stabilize the code in a shorter time ;)
 http://villemain.org/projects/pgfincore
 
 
 
  Is it possible?
 
  Thanks for any hints!
 
  Alexei
 
 
 
 
 
 -- 
 Cédric Villemain

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] data dump help

2010-01-18 Thread Bret S. Lambert
On Mon, Jan 18, 2010 at 05:49:32PM -0600, Terry wrote:
 On Mon, Jan 18, 2010 at 5:07 PM, Terry td3...@gmail.com wrote:
  On Mon, Jan 18, 2010 at 4:48 PM, Andy Colson a...@squeakycode.net wrote:
  On 1/18/2010 4:08 PM, Terry wrote:
 
  Hello,
 
   Sorry for the poor subject. Not sure how to describe what I need
   here. I have an application that logs to a single table in pgsql.
   In order to get it into our log management, I need to dump it out
   to a file on a periodic basis to pick up new logs. I am not sure
   how to tackle this. I thought about doing a date calculation,
   grabbing the previous 6 hours of logs, writing that to a new log
   file, and setting up a rotation like that. Unfortunately, the log
   management solution can't read from pgsql directly. Thoughts?
 
  Thanks!
 
 
   How about a flag in the db, like dumped?
  
   Inside one transaction you'd be safe doing:
  
   BEGIN;
   SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
   SELECT * FROM log WHERE dumped = 0;
   -- app code to format/write/etc.
   UPDATE log SET dumped = 1 WHERE dumped = 0;
   COMMIT;
  
   Even if other transactions insert new records, your existing
   transaction won't see them, and the update won't touch them.
 
  -Andy
 
 
   I like your thinking, but I shouldn't add a new column to this
   database.  It's a 3rd-party application.
 
 
 Although... I really like your idea, so I might create another table
 where I log whether the data has been dumped or not. I just need to
 come up with a query to check this against the other table.

Isn't this just over-engineering? Why not let the database do the
work, and add the column with a default value of false, so that you
don't have to modify whatever 3rd-party app writes the data:

ALTER TABLE log ADD COLUMN dumped boolean DEFAULT FALSE;

(I don't do much ALTER TABLE, so that syntax may be all foobar'ed)
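
For completeness, a sketch of the periodic dump using that column,
borrowing the transaction idiom quoted above:

    BEGIN;
    SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SELECT * FROM log WHERE NOT dumped;  -- app formats/writes these rows
    UPDATE log SET dumped = TRUE WHERE NOT dumped;
    COMMIT;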

- Bret

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] data dump help

2010-01-18 Thread Bret S. Lambert
On Tue, Jan 19, 2010 at 06:35:10PM +1300, Andrej wrote:
 2010/1/19 Bret S. Lambert bret.lamb...@gmail.com:
 
  Isn't this just over-engineering? Why not let the database do
  the work, and add the column with a default value of 0, so that
  you don't have to modify whatever 3rd-party app dumps the data:
 
 But what if his third-party software does something silly like a
 SELECT * on the table, and then throws a hissy fit because the data
 no longer matches its expectations?

He said his app logs there, so I kind of assumed that it's write-only
as far as the app is concerned. If not, then, yes, there could be
issues.

But why not keep things as simple as possible?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general