Re: [GENERAL] (Never?) Kill Postmaster?

2007-11-08 Thread Tom Lane
Christian Schröder <[EMAIL PROTECTED]> writes:
> any news about this issue?

Not a lot.  I've been able to confirm on my own Fedora 6 machine that
the mere act of loading plperl.so into the backend causes the process
to have one thread instead of no threads, as reported by gdb.  How
relevant that is isn't clear.  I experimented with executing a plperl
function and then kill -9'ing the connected psql frontend process
while executing a query that returned a lot of output.  But all I could
get out of that was

LOG:  could not send data to client: Broken pipe
STATEMENT:  select * from generate_series(1,100);
LOG:  could not receive data from client: Connection reset by peer
LOG:  unexpected EOF on client connection

The first of these lines shows that control passed unscathed through
the place where your stack trace shows it was hung up.  So there is
something different between your situation and my test.  It could be
that the flaky-internet-connection case is different from my
kill-the-client test, but it's pretty hard to see how that would matter
in userland, especially not for the act of calling strerror() after
control has already returned from the kernel.  What I think is that the
perl stuff your session has done has included some action that changed
the condition of the backend process ... exactly what, I have no idea.
Can you show us the plperl functions that were used in these sessions?

regards, tom lane



Re: [GENERAL] pg_ctl & show all

2007-11-08 Thread smiley2211

I was able to figure this out...

Thanks all...Michelle


smiley2211 wrote:
> 
> Hello all,
> 
> I changed my postgresql.conf settings and restarted the postmaster so the
> changes would take effect; however, when I type 'show all' I don't see the
> changes reflected... how do I get this file to be reloaded?
> 
> effective_cache (requires restart)
> shared_buffers (requires restart)
> 
> version 8.2.5
> 
> Thanks...Michelle
> 





Re: [GENERAL] (Never?) Kill Postmaster?

2007-11-08 Thread Tom Lane
Christian Schröder <[EMAIL PROTECTED]> writes:
> I don't want to "kill -9" the processes because the last time 
> I did this the database was in recovery mode for a substantial amount of 
> time.

A useful tip on that: if you perform a manual CHECKPOINT just before
issuing the kills, recovery time should be minimal.
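
For example, a minimal sketch of that sequence (the PID shown is a placeholder):

    CHECKPOINT;
    -- then, from the shell:
    --   kill -9 <pid-of-stuck-backend>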

regards, tom lane



[GENERAL] pg_ctl & show all

2007-11-08 Thread smiley2211

Hello all,

I changed my postgresql.conf settings and restarted the postmaster so the changes
would take effect; however, when I type 'show all' I don't see the changes
reflected... how do I get this file to be reloaded?

effective_cache (requires restart)
shared_buffers (requires restart)

version 8.2.5

Thanks...Michelle




Re: [GENERAL] Optimal time series sampling.

2007-11-08 Thread Gregory Stark
"Ted Byers" <[EMAIL PROTECTED]> writes:

> As a prelude to where I really want to go, please
> consider the following SELECT statement.
>
>   SELECT close_price FROM stockprices A
>   WHERE price_date =
>  (SELECT MAX(price_date) FROM stockprices B
>   WHERE A.stock_id = B.stock_id AND A.stock_id = id);

I assume you're missing another "stock_id = id" on the outer query?

I think you'll have to post the actual explain analyze output you're getting
and the precise schema you have. You might need an index on
.

> It appears to do the right thing.  I certainly get the
> right answer, but I am not seeing a significant
> difference in performance.  Worse, when I invoke
> something like it for a suite of about two dozen
> stocks, it takes about ten minutes to complete.  

That would be an entirely different ball of wax than trying to pull out a
single stock's closing price. I suspect you're going to want to use Postgres's
"DISTINCT ON" SQL extension. Something like:

SELECT DISTINCT ON (stock_id) *
  FROM stockprices
 ORDER BY stock_id, price_date DESC

And you may want an index on < stock_id, price_date DESC >
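
For concreteness, a sketch of such an index (the index name here is made up;
DESC ordering of index columns needs PostgreSQL 8.3 or later, and on older
releases a plain ascending index on the same columns serves, since btree
indexes can be scanned backwards):

    CREATE INDEX stockprices_stock_date_idx
        ON stockprices (stock_id, price_date DESC);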

I believe MySQL does have a similar extension where you can use GROUP BY and
have columns listed in the select target list which aren't included in the
grouping sets.

> So I need a more complex select statement that will just select the most
> recent price for a given stock for each week (or month or quarter or year).

Do you care what happens if there were no trades for a given stock in the time
period? The query you give above using MAX would still work but the query I
described using DISTINCT ON would not emit a record for the stock at all.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [GENERAL] (Never?) Kill Postmaster?

2007-11-08 Thread Christian Schröder

Hi all,
any news about this issue? Anything else that I can do to help you? 
Meanwhile there are 4 connections in the same state. (I did not do the 
whole investigation on all 4, but since they all do not respond on a 
SIGINT I assume that they all have the same problem.)
It may also be interesting that the 4 processes belong to the same two 
users that already caused this problem earlier. Maybe it really has 
something to do with their unstable internet connection? (I mentioned 
this in an earlier mail.)
I have also noticed that one of these two users has many open 
connections which are all idle. I guess that those connections are in 
fact dead, but the database did not close them for some reason. The 
pg_stat_activity entries for the corresponding backend processes are as 
follows:


 procpid | usename | current_query | waiting |          query_start          |         backend_start
---------+---------+---------------+---------+-------------------------------+-------------------------------
   26033 | dpyrek  |               | f       | 2007-11-08 10:21:01.555232+01 | 2007-11-08 09:55:01.59932+01
   18331 | dpyrek  |               | f       | 2007-11-07 11:34:24.968852+01 | 2007-11-07 11:08:29.043762+01
   18940 | dpyrek  |               | f       | 2007-11-07 14:29:52.987176+01 | 2007-11-07 13:14:48.609031+01
   25868 | dpyrek  |               | f       | 2007-11-08 09:47:46.938991+01 | 2007-11-08 09:13:34.101351+01
    6719 | dpyrek  |               | f       | 2007-11-06 12:06:14.875588+01 | 2007-11-06 11:10:00.566644+01
   17987 | dpyrek  |               | f       | 2007-11-07 10:31:50.517275+01 | 2007-11-07 10:11:07.310338+01
   31808 | dpyrek  |               | f       | 2007-11-08 22:55:03.931727+01 | 2007-11-08 22:55:03.766638+01
   25484 | dpyrek  |               | f       | 2007-11-08 08:32:57.265377+01 | 2007-11-08 07:44:30.845967+01
    5972 | dpyrek  |               | f       | 2007-11-06 08:51:54.57437+01  | 2007-11-06 08:14:03.560602+01
    6241 | dpyrek  |               | f       | 2007-11-06 09:59:02.018452+01 | 2007-11-06 09:20:49.092246+01
    6136 | dpyrek  |               | f       | 2007-11-06 09:14:40.729837+01 | 2007-11-06 08:57:29.55187+01
   12645 | dpyrek  |               | f       | 2007-11-02 10:08:24.856929+01 | 2007-11-02 09:35:37.640976+01
   25254 | dpyrek  |               | f       | 2007-11-08 07:29:04.547081+01 | 2007-11-08 06:33:47.707759+01
   20275 | dpyrek  |               | f       | 2007-11-03 09:14:12.73829+01  | 2007-11-03 08:57:05.555972+01
   20216 | dpyrek  |               | f       | 2007-11-03 08:46:40.555354+01 | 2007-11-03 08:40:31.756993+01
   12435 | dpyrek  |               | f       | 2007-11-02 09:28:53.361365+01 | 2007-11-02 08:48:11.589485+01
   19633 | dpyrek  |               | f       | 2007-11-03 08:34:16.263487+01 | 2007-11-03 05:46:16.811069+01
   12156 | dpyrek  |               | f       | 2007-11-02 08:10:11.558427+01 | 2007-11-02 07:49:03.442489+01
    4899 | dpyrek  |               | f       | 2007-11-01 12:42:30.880391+01 | 2007-11-01 10:56:18.513398+01
   11988 | dpyrek  |               | f       | 2007-11-02 07:38:10.315758+01 | 2007-11-02 07:02:52.438251+01
    4490 | dpyrek  |               | f       | 2007-11-01 09:51:42.216745+01 | 2007-11-01 09:34:18.63771+01


An strace of one of these connections yields the following result:

db2:/home/pgsql/data # strace -p 4899
Process 4899 attached - interrupt to quit
futex(0x994000, FUTEX_WAIT, 2, NULL

This looks identical (even with the same uaddr parameter) for the other 
processes.


In the log file I find many lines like this:

<2007-11-08 22:56:19 CET - dpyrek> LOG:  could not receive data from 
client: Die Wartezeit für die Verbindung ist abgelaufen [connection timed out]

<2007-11-08 22:56:19 CET - dpyrek> LOG:  unexpected EOF on client connection

I'm not sure that these lines correspond to the dead connections, but at 
least it's the same user.


Does this additional information help you in any way? I'm a little bit 
afraid that eventually the maximum number of connections will be 
exceeded. I don't want to "kill -9" the processes because the last time 
I did this the database was in recovery mode for a substantial amount of 
time.


Any help is still highly appreciated!

Regards,
   Christian

P.S.: If nothing else helps I could also grant one of you guys root 
access to the database machine.


--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] "Resurrected" data files - problem?

2007-11-08 Thread Simon Riggs
On Thu, 2007-11-08 at 17:11 +0100, Albe Laurenz wrote:
> Tom Lane wrote:
> >>> So if we perform our database backups with incremental
> >>> backups as described above, we could end up with additional
> >>> files after the restore, because PostgreSQL files can get
> >>> deleted (e.g. during DROP TABLE or TRUNCATE TABLE).
> >>> 
> >>> Could such "resurrected" files (data files, files in
> >>> pg_xlog, pg_clog or elsewhere) cause a problem for the database
> >>> (other than the obvious one that there may be unnecessary files
> >>> about that consume disk space)?
> >> 
> >> This will not work at all.
> > 
> > To be more specific: the resurrected files aren't the problem;
> > offhand I see no reason they'd create any issue beyond wasted
> > disk space.  The problem is version skew between files that were
> > backed up at slightly different times, leading to inconsistency.
> 
> I should have mentioned that before the (incremental) backup
> there would be a pg_start_backup() and a pg_stop_backup()
> afterwards, and we would use PITR.
> 
> So there could only be three kinds of files:
> - Files that did not change since the full backup, restored
>   from there. They should therefore look exactly as if the
>   online backup were performed in the normal way.
> - Files that have changed or are new, restored from the
>   incremental backup. These will also be ok, because
>   they were backed up between pg_start_backup() and
>   pg_stop_backup().
> - Files that have been deleted between full and incremental
>   backup and have been resurrected.
> 
> This third group is the only one which might be problematic,
> as far as I can see, because PostgreSQL will not expect them to
> be there.
> 
> The version skew between files backed up at slightly different
> times should be taken care of by PITR, shouldn't it?

The backup is not instantaneous, so there is no single time at which the
hot backup takes place. So deciding whether a file has changed based
upon a comparison of two file timestamps cannot work. You would need to
take timestamps for each file both before pg_start_backup() and after
pg_stop_backup(), for both the full and the incremental backup.
If all four timestamps are equal, then you are safe.
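
For reference, a minimal sketch of the bracketing under discussion (the label
and the copy step are placeholders):

    SELECT pg_start_backup('incremental-backup-label');
    -- copy the cluster's data directory with your backup tool here,
    -- recording per-file timestamps before and after as described above
    SELECT pg_stop_backup();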

The relfilenode ids are potentially reused after a period of time, so
that could cause errors if not catered for on the incremental restore.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com




Re: [GENERAL] subselect field "problem"

2007-11-08 Thread Thomas H.


>> i was writing the query below containing a subquery. by mistake, i
>> referenced a field from the main table in the subquery, leading to a
>> very strange (but working??) result. the planner announced an insanely
>> high startup cost, but the query itself finished pretty quickly.
>
> Pick up any SQL book and read up on correlated subqueries.
>
> Have a nice day,

thanks! I didn't know this term. The result is pretty obvious now, and
so is the high cost prediction.


regards,
thomas


Re: [GENERAL] subselect field "problem"

2007-11-08 Thread Martijn van Oosterhout
On Thu, Nov 08, 2007 at 09:35:19PM +0100, Thomas H. wrote:
> i was writing the query below containing a subquery. by mistake, i 
> referenced a field from the main table in the subquery, leading to a 
> very strange (but working??) result. the planner announced an insanely 
> high startup cost, but the query itself finished pretty quickly.

Pick up any SQL book and read up on correlated subqueries.

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution 
> inevitable.
>  -- John F Kennedy




Re: [GENERAL] System V IPC on Windows

2007-11-08 Thread Magnus Hagander
Lee Keel wrote:
>> -Original Message-
>> From: [EMAIL PROTECTED] [mailto:pgsql-general-
>> [EMAIL PROTECTED] On Behalf Of Magnus Hagander
>> Sent: Thursday, November 08, 2007 1:30 PM
>> To: Kevin Neufeld
>> Cc: Tom Lane; pgsql-general@postgresql.org
>> Subject: Re: [GENERAL] System V IPC on Windows
>>
>> Pg on win32 is 32-bit. It has a total address space of 2Gb, minus the OS
>> overhead, minus the code, minus local memory etc. You're just not going
>> to fit that much in the address space.
>>
>> There are fixes to reduce the memory usage of the postmaster (which is
>> likely what runs out first, unless you have a large work_mem) in 8.3,
>> but you're still right up against the wall with that large
>> shared_buffers. Plus, as I said in my other email, you're likely not
>> seeing any performance gain from such a large shared_buffers anyway. If
>> you are, you're seeing something new, and we definitely need to find out
>> why.
>>
>>
>> //Magnus
>>
>>
> [Lee Keel] 
> 
> I can't do any benchmarks because I keep getting errors.  But I have dropped
> this value down and I am not getting the out of memory errors any more.  I
> was trying to solve other problems by bumping this way up but it seemed to

You could try benchmarking in-between levels, like comparing 128MB to
512MB. A couple of those should show you a trend.
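
For example (a sketch only; the values are illustrative, and each change to
shared_buffers requires a restart):

    # postgresql.conf -- benchmark each of these in turn and compare
    shared_buffers = 128MB    # then retry with 256MB and 512MB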


> just cause more problems.  I found several things in the documentation that say
> that bumping shared_buffers/work_mem up to over 1GB was fine.  Is that
> strictly for Linux boxes?

No, it's valid for most platforms, but it's generally not valid for
Windows.

//Magnus



[GENERAL] subselect field "problem"

2007-11-08 Thread Thomas H.

hi list

i was writing the query below containing a subquery. by mistake, i 
referenced a field from the main table in the subquery, leading to a 
very strange (but working??) result. the planner announced an insanely 
high startup cost, but the query itself finished pretty quickly.


nevertheless, shouldn't pgsql warn the user if he's referencing a 
non-existing field in a subquery? the field referenced in the subquery's 
WHERE-clause doesn't exist in the subquery's table, thus i don't even 
understand why that wouldn't throw an error and how the result would 
have to be interpreted:


SELECT * FROM titles
WHERE tit_id IN
(
   SELECT DISTINCT nam_tit_id
   FROM names
   WHERE lower(tit_name) LIKE '%best%'
)

the field "tit_name" is in "titles". the field i intented to use was 
"nam_name" from table "names"...


regards,
thomas






Re: [GENERAL] what is the date format in binary query results

2007-11-08 Thread Alvaro Herrera
Reg Me Please wrote:
> Il Thursday 08 November 2007 16:18:58 Tom Lane ha scritto:
> > It's either an int8 representing microseconds away from 2000-01-01
> > 00:00:00 UTC, or a float8 representing seconds away from the same
> > origin.
> 
> Does this mean that negative numbers are for timestamps before y2k?

Yes.

> Why and when there is a choice between int8 and float8 representation?

At compilation time, by the --enable-integer-datetimes flag to
configure.  You can find out whether the server you're currently
connected to uses integer datetimes with SHOW integer_datetimes.
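
For illustration, a sketch of checking the setting and decoding a raw binary
value on an integer-datetimes build (the microsecond count below is made up):

    SHOW integer_datetimes;
    -- turn an int8 microsecond offset from 2000-01-01 00:00:00 UTC into a timestamp
    SELECT TIMESTAMPTZ '2000-01-01 00:00:00 UTC'
           + (248482800000000::float8 / 1e6) * INTERVAL '1 second';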

-- 
Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J
"The only difference is that Saddam would kill you on private, where the
Americans will kill you in public" (Mohammad Saleh, 39, a building contractor)



Re: [GENERAL] System V IPC on Windows

2007-11-08 Thread Lee Keel
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:pgsql-general-
> [EMAIL PROTECTED] On Behalf Of Magnus Hagander
> Sent: Thursday, November 08, 2007 1:30 PM
> To: Kevin Neufeld
> Cc: Tom Lane; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] System V IPC on Windows
> 
> Pg on win32 is 32-bit. It has a total address space of 2Gb, minus the OS
> overhead, minus the code, minus local memory etc. You're just not going
> to fit that much in the address space.
> 
> There are fixes to reduce the memory usage of the postmaster (which is
> likely what runs out first, unless you have a large work_mem) in 8.3,
> but you're still right up against the wall with that large
> shared_buffers. Plus, as I said in my other email, you're likely not
> seeing any performance gain from such a large shared_buffers anyway. If
> you are, you're seeing something new, and we definitely need to find out
> why.
> 
> 
> //Magnus
> 
> 
[Lee Keel] 

I can't do any benchmarks because I keep getting errors.  But I have dropped
this value down and I am not getting the out of memory errors any more.  I
was trying to solve other problems by bumping this way up but it seemed to
just cause more problems.  I found several things in the documentation that say
that bumping shared_buffers/work_mem up to over 1GB was fine.  Is that
strictly for Linux boxes?

-LK



Re: [GENERAL] System V IPC on Windows

2007-11-08 Thread Magnus Hagander
Kevin Neufeld wrote:
> That makes sense, thanx.
> Another individual was having problems adjusting the shared_memory
> settings higher than 1.2GB on a 8GB 64bit machine running Vista. 

Pg on win32 is 32-bit. It has a total address space of 2Gb, minus the OS
overhead, minus the code, minus local memory etc. You're just not going
to fit that much in the address space.

There are fixes to reduce the memory usage of the postmaster (which is
likely what runs out first, unless you have a large work_mem) in 8.3,
but you're still right up against the wall with that large
shared_buffers. Plus, as I said in my other email, you're likely not
seeing any performance gain from such a large shared_buffers anyway. If
you are, you're seeing something new, and we definitely need to find out
why.


//Magnus




Re: [GENERAL] System V IPC on Windows

2007-11-08 Thread Magnus Hagander
Lee Keel wrote:
>> -Original Message-
>> From: [EMAIL PROTECTED] [mailto:pgsql-general-
>> [EMAIL PROTECTED] On Behalf Of Magnus Hagander
>> Sent: Thursday, November 08, 2007 12:47 AM
>> To: Tom Lane
>> Cc: Kevin Neufeld; pgsql-general@postgresql.org
>> Subject: Re: [GENERAL] System V IPC on Windows
>>
>>  > > Does anyone know how to adjust the IPC settings in Windows?
>>> There aren't any such settings in Windows, AFAIK.
>> Correct. The only real adjustable limit is the size of the Windows
>> pagefile, but that one is normally dynamic. But there must be room for all
>> the shared memory in it. It's not going to be there, but the space is
>> reserved.
>>
>> That said, if you need to increase the pagefile size to accomodate your
>> shared buffers, you likely have way too large value for shared buffers.
>>
>> /Magnus
>>
>>
>> ---(end of broadcast)---
>> TIP 5: don't forget to increase your free space map settings
> [Lee Keel] 
> 
> What if the page file exceeds the shared_buffers, but you can't increase the
> shared buffers to a larger amount?  For example, page file is set to be
> between 500MB and 10GB, but you can't set the shared_buffers to more than
> 1200MB.  If set to 1300MB or higher then service will not start.

Have you measured any performance at all on this? The general
recommendation is to have a *small* shared_buffers on Windows. >1GB is
likely way too large - try something much smaller unless you have
benchmarks showing that this is helping you.

//Magnus



Re: [GENERAL] INSERT performance deteriorates quickly during a large import

2007-11-08 Thread Chris Browne
[EMAIL PROTECTED] ("=?UTF-8?B?VG9tw6HFoSBWb25kcmE=?=") writes:
> Try one of these:
>
> a) don't use INSERT statements, use a COPY instead
>
> b) from time to time run ANALYZE on the "public" table (say 1000
>inserts, then one analyze)
>
> c) create the table without constraints (primary / foreign keys in this
>case), import all the data, and then create the constraints
>
> The (b) and (c) may be combined, i.e. import without constraints and
> analyze from time to time. I'd probably try the (a) at first, anyway.
>
> Try to gather some more statistics - is the problem related to CPU or
> I/O? Use 'dstat' for example - this might give you a hint in case
> the advice mentioned above doesn't help.

I agree with all but b).

- If you use COPY, that copies data "in bulk" which is *way* faster
  than submitting individual INSERT statements that must each be parsed.

  So I certainly agree with a).

- There are two prime reasons to expect the data load to slow down:

  1. Because adding entries to the index gets more expensive the
 larger the table gets;

  2. Because searching through foreign key constraints tends to get
 more expensive as the target table grows.

  Those point to doing c).

If you put off evaluating indices and foreign key constraints until
all of the data is loaded, there should be no need to run ANALYZE
during the COPY process.

And there should be no reason for loading data to get more costly as
the size of the table increases.
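
As a concrete sketch of that pattern, using the constraint names from the
table's DDL in the original post (the file path is a placeholder, and the
primary key and any other indexes could be handled the same way):

    ALTER TABLE public.sp_thread_replies DROP CONSTRAINT sp_thread_replies_threads_fk;
    ALTER TABLE public.sp_thread_replies DROP CONSTRAINT sp_thread_replies_users_fk;

    COPY public.sp_thread_replies FROM '/path/to/exported_rows.csv' WITH CSV;

    ALTER TABLE public.sp_thread_replies ADD CONSTRAINT sp_thread_replies_threads_fk
        FOREIGN KEY (thread_id) REFERENCES public.sp_threads (id) ON DELETE CASCADE;
    ALTER TABLE public.sp_thread_replies ADD CONSTRAINT sp_thread_replies_users_fk
        FOREIGN KEY (user_id) REFERENCES public.sp_users (id);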
-- 
let name="cbbrowne" and tld="linuxfinances.info" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/advocacy.html
Rules of the Evil Overlord #116.  "If I capture the hero's starship, I
will keep it in  the landing bay with the ramp down,  only a few token
guards on  duty and a ton  of explosives set to  go off as  soon as it
clears the blast-range." 



Re: [GENERAL] INSERT performance deteriorates quickly during a large import

2007-11-08 Thread Bill Moran
In response to "Krasimir Hristozov \(InterMedia Ltd\)" <[EMAIL PROTECTED]>:

> We need to import data from a relatively large MySQL database into an
> existing PostgreSQL database, using a PHP5 script that SELECTs from MySQL
> and INSERTs in PostgreSQL. A part of the import involves moving about
> 1,300,000 records from one MySQL table to one of our PostgreSQL tables. The
> problem is that the insert performance inevitably deteriorates as the number
> of inserts increases.
> 
> We tried different approaches:
> 
>  * selecting only parts of the source table data based on a certain
> condition
>  * selecting all of the source data
> 
> coupled with either of these:
> 
>  * inserting without explicit transactions
>  * inserting all the data in a single transaction
>  * inserting the data in partial transactions of about 100, 1000, 1,
> 10 inserts each
> 
> While there were performance benefits in some of the cases (selecting all
> the data and inserting in transaction packets of about 1000 each being the
> fastest), the problem was that it still deteriorated as the import
> progressed.
> 
> We tried removing all foreign keys and indices from the postgres table,
> still gained performance, but it deteriorated as well.
> 
> The latest (and best performing) test we did was under the following
> conditions:
> 
>  * 11851 pre-existing records in the destination table
>  * the table was vacuumed just before the import
>  * all foreign keys and indices were removed from the destination table
>  * selected all of the data from the source table at once
>  * inserted in transactions of 1000 inserts each
> 
> We displayed time statistics on each 100 inserts. The process started at
> about 1 second per 100 inserts. This estimated to about 4 hours for the
> entire process. 14 hours later it had imported about a quarter of the data
> (a bit more than 33 records), and 100 inserts now took nearly 40
> seconds.
> 
> We tested reading from MySQL alone, without inserting the data in Postgres.
> All records were read in about a minute and a half, so MySQL performance is
> not a part of the issue. The PHP script selects the MySQL data, fetches rows
> sequentially, occasionally performs a couple of selects against PostgreSQL
> data (which is cached in a PHP array to reduce the DB operations; no more
> than 8 array elements, integer keys, integer data), and inserts into
> PostgreSQL. The algorithm seems to be linear in nature and performance
> deterioration most probably doesn't have to do with the PHP code.
> 
> Has anyone had an issue like this, and perhaps a suggestion for a possible
> cause and solution? Is it common for PostgreSQL to grow so slow as the
> amount of data in the tables increases? If so, is it just the insert
> operation or all kinds of queries? Isn't 30 records too low a threshold
> for such performance deterioration?
> 
> Here are some technical details, that might be helpful:
> 
>  * PHP, MySQL and PostgreSQL all work on the same server, sharing the same
> memory and hard drive.

This makes it very difficult to blame PostgreSQL.  If the insert process
is CPU bound, and PHP is using a ton of CPU, then PG will be starved.
You kinda contradict yourself, saying PG is not starved, then saying
that the CPU is maxed out.  In any event, having all three on one machine
will make it more fun to isolate where the actual bottleneck is.

>  * the server runs FreeBSD 5.3-RELEASE on an AMD Athlon(tm) 64 Processor
> 3000+ (2GHz K8 class CPU) with 1GB RAM

This is another problem.  5.3 has the worst performance of any version
of FreeBSD I've ever used.  Even downgrading to 4.11 (not recommended)
would produce a performance improvement, but you should get this
system to 5.5 (at least) or 6.2 (preferably).

>  * the software versions installed are Apache 1.3.37, PHP 5.2.1, MySQL
> 4.1.22, PostgreSQL 8.1.8
>  * postgresql.conf variables other than defaults are: max_connections = 40,
> shared_buffers = 1000 (this is the default)

1000 shared_buffers is pretty low for any real work.  While your tables
aren't huge, they're big enough to warrant more shared_buffers.  However,
you've only got 1G of RAM on this system to share between two DB servers,
which is going to constrain you a good bit.

There are other settings important to insert performance that you haven't
mentioned.  checkpoint_segments and the like, for example.  You also don't
describe your disk subsystem; it's entirely possible you've filled up the
cache on the disk controllers (which is why it looked initially fast) and
now are hitting up against the max speed the disks can do.  With only 1G
of RAM, it's possible that MySQL is reading, PostgreSQL is writing, and
PHP is swapping.  It doesn't take a lot of disk contention to flush
performance down the toilet.  See what iostat says.  Even better, use
top in "m" mode (hit m after top start) to see how much IO each process
is using (I believe that was added in 5.X, but it may only be available
in 6.X v

Re: [GENERAL] INSERT performance deteriorates quickly during a large import

2007-11-08 Thread Tom Lane
"Krasimir Hristozov \(InterMedia Ltd\)" <[EMAIL PROTECTED]> writes:
> We need to import data from a relatively large MySQL database into an
> existing PostgreSQL database, using a PHP5 script that SELECTs from MySQL
> and INSERTs in PostgreSQL. A part of the import involves moving about
> 1,300,000 records from one MySQL table to one of our PostgreSQL tables. The
> problem is that the insert performance inevitably deteriorates as the number
> of inserts increases.

Are you *certain* you've gotten rid of all the indexes and foreign keys?
A simple insert ought to be pretty much constant-time in Postgres, so it
seems to me that you've missed something.

It also seems possible that you are wrong to disregard PHP as a possible
source of the problem.  Have you tried watching the PHP and PG backend
processes with "top" (or similar tool) to see who's consuming CPU time
and/or memory space?

regards, tom lane



Re: [GENERAL] what is the date format in binary query results

2007-11-08 Thread Tom Lane
Reg Me Please <[EMAIL PROTECTED]> writes:
> Il Thursday 08 November 2007 17:09:22 Tom Lane ha scritto:
>> configure --enable-integer-datetimes.

> How can I tell which one has been choosen by my distribution (Ubuntu)?

"show integer_datetimes".  For programmatic purposes, try
PQparameterStatus(pgconn, "integer_datetimes").

regards, tom lane



Re: [GENERAL] why there is no interval / interval operator?

2007-11-08 Thread Pavel Stehule
one possible implementation
http://www.pgsql.cz/index.php/PL/pgSQL_%28en%29#Usage_of_PL.2FpgSQL_functions_for_designing_own_operators
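
For illustration only, a minimal sketch of one possible definition, dividing
the two intervals' total lengths in seconds (it glosses over the fact that an
interval has several components, and it fails on a zero-length divisor):

    CREATE FUNCTION interval_div(interval, interval) RETURNS float8 AS $$
        SELECT EXTRACT(EPOCH FROM $1) / EXTRACT(EPOCH FROM $2)
    $$ LANGUAGE sql IMMUTABLE STRICT;

    CREATE OPERATOR / (LEFTARG = interval, RIGHTARG = interval, PROCEDURE = interval_div);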

Pavel

On 08/11/2007, hubert depesz lubaczewski <[EMAIL PROTECTED]> wrote:
> is it just a simple omission, or am i missing something?
>
> we have interval / float8 ( = interval), so i think that adding interval
> / interval ( = float8) should be possible.
>
> depesz
>
> --
> quicksil1er: "postgres is excellent, but like any DB it requires a
> highly paid DBA.  here's my CV!" :)
> http://www.depesz.com/ - blog dla ciebie (i moje CV)
>
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>choose an index scan if your joining column's datatypes do not
>match
>



Re: [GENERAL] INSERT performance deteriorates quickly during a large import

2007-11-08 Thread Tomáš Vondra

Try one of these:

a) don't use INSERT statements, use a COPY instead

b) from time to time run ANALYZE on the "public" table (say 1000
   inserts, then one analyze)

c) create the table without constraints (primary / foreign keys in this
   case), import all the data, and then create the constraints

The (b) and (c) may be combined, i.e. import without constraints and 
analyze from time to time. I'd probably try the (a) at first, anyway.


Try to gather some more statistics - is the problem related to CPU or 
I/O? Use 'dstat' for example - this might give you a hint in case 
the advice mentioned above doesn't help.


Tomas


We need to import data from a relatively large MySQL database into an
existing PostgreSQL database, using a PHP5 script that SELECTs from MySQL
and INSERTs in PostgreSQL. A part of the import involves moving about
1,300,000 records from one MySQL table to one of our PostgreSQL tables. The
problem is that the insert performance inevitably deteriorates as the number
of inserts increases.

We tried different approaches:

* selecting only parts of the source table data based on a certain
condition
* selecting all of the source data

coupled with either of these:

* inserting without explicit transactions
* inserting all the data in a single transaction
* inserting the data in partial transactions of about 100, 1000, 1,
10 inserts each

While there were performance benefits in some of the cases (selecting all
the data and inserting in transaction packets of about 1000 each being the
fastest), the problem was that it still deteriorated as the import
progressed.

We tried removing all foreign keys and indices from the postgres table,
still gained performance, but it deteriorated as well.

The latest (and best performing) test we did was under the following
conditions:

* 11851 pre-existing records in the destination table
* the table was vacuumed just before the import
* all foreign keys and indices were removed from the destination table
* selected all of the data from the source table at once
* inserted in transactions of 1000 inserts each

We displayed time statistics on each 100 inserts. The process started at
about 1 second per 100 inserts. This estimated to about 4 hours for the
entire process. 14 hours later it had imported about a quarter of the data
(a bit more than 33 records), and 100 inserts now took nearly 40
seconds.

We tested reading from MySQL alone, without inserting the data in Postgres.
All records were read in about a minute and a half, so MySQL performance is
not a part of the issue. The PHP script selects the MySQL data, fetches rows
sequentially, occasionally performs a couple of selects against PostgreSQL
data (which is cached in a PHP array to reduce the DB operations; no more
than 8 array elements, integer keys, integer data), and inserts into
PostgreSQL. The algorithm seems to be linear in nature and performance
deterioration most probably doesn't have to do with the PHP code.

Has anyone had an issue like this, and perhaps a suggestion for a possible
cause and solution? Is it common for PostgreSQL to grow so slow as the
amount of data in the tables increases? If so, is it just the insert
operation or all kinds of queries? Isn't 30 records too low a threshold
for such performance deterioration?

Here are some technical details, that might be helpful:

* PHP, MySQL and PostgreSQL all work on the same server, sharing the same
memory and hard drive.
* the server runs FreeBSD 5.3-RELEASE on an AMD Athlon(tm) 64 Processor
3000+ (2GHz K8 class CPU) with 1GB RAM
* the software versions installed are Apache 1.3.37, PHP 5.2.1, MySQL
4.1.22, PostgreSQL 8.1.8
* postgresql.conf variables other than defaults are: max_connections = 40,
shared_buffers = 1000 (this is the default)
* we have also tried these on another server with Red Hat Enterprise Linux
ES release 4 (Linux 2.6.9-42.0.3.ELsmp) on 2xDual Core AMD Opteron(tm)
Processor 270 (4x2GHz logical CPUs) with 2GB RAM
* both servers run in x86_64 mode, PostgreSQL footprint in memory stays
relatively small, CPU usage maxes out on import, there is no resource
starvation in any way

DDL statement for the creation of the PostgreSQL table in question:

CREATE TABLE "public"."sp_thread_replies" (
  "id" SERIAL,
  "thread_id" INTEGER NOT NULL,
  "body" TEXT NOT NULL,
  "ts_added" INTEGER DEFAULT 0 NOT NULL,
  "user_id" INTEGER NOT NULL,
  "thread_offset" INTEGER DEFAULT 0,
  "approved" SMALLINT DEFAULT 1,
  "title" TEXT,
  "deleted" SMALLINT DEFAULT 0,
  "edit_reason" VARCHAR(255),
  "edit_user_id" INTEGER,
  "edit_time" INTEGER,
  CONSTRAINT "sp_thread_replies_pkey" PRIMARY KEY("id"),
  CONSTRAINT "sp_thread_replies_threads_fk" FOREIGN KEY ("thread_id")
REFERENCES "public"."sp_threads"("id")
ON DELETE CASCADE
ON UPDATE NO ACTION
NOT DEFERRABLE,
  CONSTRAINT "sp_thread_replies_users_fk" FOREIGN KEY ("user_id")
REFERENCES "public"."sp_users"("id")
ON DELETE NO ACTION
ON UPDATE

[GENERAL] Optimal time series sampling.

2007-11-08 Thread Ted Byers
As a prelude to where I really want to go, please
consider the following SELECT statement.

  SELECT close_price FROM stockprices A
  WHERE price_date =
 (SELECT MAX(price_date) FROM stockprices B
  WHERE A.stock_id = B.stock_id AND A.stock_id =
id);

stockprices has a primary key comprised of stock_id
and price_date, and I tried the same query with an
extra index on price_date (but that index made no
difference in apparent performance as seen on the
clock on the wall).

I have been advised (on the MySQL board), to use the
following (with the claim, unsupported as far as I can
tell, that it is both correct and much faster - it
appears to be correct, but it is certainly no faster):

  SELECT A.`close` AS close_price
  FROM stockprices A LEFT JOIN stockprices B
ON A.stock_id = B.stock_id
  AND B.price_date > A.price_date
  WHERE B.price_date IS NULL
  AND A.stock_id = id;

It appears to do the right thing.  I certainly get the
right answer, but I am not seeing a significant
difference in performance.  Worse, when I invoke
something like it for a suite of about two dozen
stocks, it takes about ten minutes to complete.  (I
may try a variant in which the last clause used in
WHERE is replaced by IN followed by a trivial select
that gets the same two dozen stock_ids, to see if that
helps.)

Now, I am concerned with performance because, and this
is where I really want to go, I want to adapt this
logic to create new time series of closing prices, but
at the granularity of a week, a month or quarter, and
there is no predicting a priori how long the series
is.  IBM's data goes back decades while I have data
for other stocks that go back only a couple years.

Now, a junior programmer here had suggested just doing
a simple select, at least for weekly granularity, by
selecting a value if its day of the week computes to
Friday.  That can't work correctly because in some
weeks, there are statutory holidays that land on
Fridays, resulting in the last actual trading day for
that week being Thursday.  His simple approach
guarantees that many records that ought to be included
will be ignored.  I need a more dynamic and flexible
approach which allows me to work on the basis that I
have prices for all trading days for a given stock
from the time my data for it begins.  So I need a more
complex select statement that will just select the
most recent price for a given stock for each week (or
month or quarter or year).
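
For concreteness, one possible shape of such a statement, shown only as a
sketch using PostgreSQL's DISTINCT ON extension (one row per stock per week,
keeping the latest trading day in each week; column names as in the query
above):

   SELECT DISTINCT ON (stock_id, date_trunc('week', price_date))
          stock_id, price_date, close_price
     FROM stockprices
    ORDER BY stock_id, date_trunc('week', price_date), price_date DESC;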

Now, I can get the full time series for two dozen
stocks, as slow and brain dead as doing a select for
each stock ID, AND have my Java code construct and
display a chart, in less than 20 seconds (and Java
does not have a reputation for being fast).  I need
whatever solution I use to be that quick.

Any thoughts about how best to attack this in order to
get the correct results as fast as is possible?  What
options would you consider, WRT defining the SQL
statements you would benchmark, in order to design
your benchmark testing?

Thanks,

Ted



Re: [GENERAL] Calculation for Max_FSM_pages : Any rules of thumb?

2007-11-08 Thread Bill Moran
In response to Vivek Khera <[EMAIL PROTECTED]>:

> 
> On Nov 1, 2007, at 8:51 PM, Ow Mun Heng wrote:
> 
> > Another question is, based on what I've read in the archives (in my
> > laptop.. No-Inet conn @ work) Since I've overran my max_FSM, I'm
> > basically screwed and will have to do a vacuum verbose FULL on the
> > entire DB. Crap..
> 
> I've seen this repeated many times as well, and I can't think of a  
> really good reason why this should be true.

It's not inherently true, it's just likely.

> Once you increase max fsm  
> pages, won't the very next regular vacuum find all the free space in  
> pages and add them to the map anyway?

Yes.

> Ie, you've not "lost" any free  
> space once the next regular vacuum runs.  At worst, you've got a  
> slightly bloated table because you allocated more pages rather than
> re-using some, but is that worth a full vacuum?

The situation you just described is the reason I recommend a full
vacuum after such a situation has occurred.  No, it's not required
in all cases, but it's a lot easier to recommend than the research
required to determine whether or not your table bloat is excessive
enough to warrant it.

If you can make the time to do the full vacuum, it's probably worth
it, just for peace of mind.  If it's difficult to schedule a full
vacuum, then you need to carefully review various page usages to
see if any individual tables are worth it and/or all kinds of careful
consideration.  As a result, I recommend a full vacuum, and if the
person complains that they can't schedule it, _then_ I go into the
details of how to figure out what else can/should be done.

So I guess I'm recommending it to make my own life easier :)

> I don't think it will be  
> unless you're *way* under the fsm pages needed and have been for a  
> long time.

Frequently, when people ask for help because they've exceeded max_fsm*,
it's because they're not paying attention to their systems, and therefore
the problem has been occurring for a while before it got so bad that
they couldn't ignore it.  As a result, a full vacuum is frequently a
necessity.

Folks who are monitoring their databases closely don't hit this
problem nearly as often.

-- 
Bill Moran
http://www.potentialtech.com



[GENERAL] INSERT performance deteriorates quickly during a large import

2007-11-08 Thread Krasimir Hristozov (InterMedia Ltd)

We need to import data from a relatively large MySQL database into an
existing PostgreSQL database, using a PHP5 script that SELECTs from MySQL
and INSERTs in PostgreSQL. A part of the import involves moving about
1,300,000 records from one MySQL table to one of our PostgreSQL tables. The
problem is that the insert performance inevitably deteriorates as the number
of inserts increases.

We tried different approaches:

* selecting only parts of the source table data based on a certain
condition
* selecting all of the source data

coupled with either of these:

* inserting without explicit transactions
* inserting all the data in a single transaction
* inserting the data in partial transactions of about 100, 1000, 1,
10 inserts each

While there were performance benefits in some of the cases (selecting all
the data and inserting in transaction packets of about 1000 each being the
fastest), the problem was that it still deteriorated as the import
progressed.

We tried removing all foreign keys and indices from the postgres table,
still gained performance, but it deteriorated as well.

The latest (and best performing) test we did was under the following
conditions:

* 11851 pre-existing records in the destination table
* the table was vacuumed just before the import
* all foreign keys and indices were removed from the destination table
* selected all of the data from the source table at once
* inserted in transactions of 1000 inserts each

We displayed time statistics on each 100 inserts. The process started at
about 1 second per 100 inserts. This estimated to about 4 hours for the
entire process. 14 hours later it had imported about a quarter of the data
(a bit more than 33 records), and 100 inserts now took nearly 40
seconds.

We tested reading from MySQL alone, without inserting the data in Postgres.
All records were read in about a minute and a half, so MySQL performance is
not a part of the issue. The PHP script selects the MySQL data, fetches rows
sequentially, occasionally performs a couple of selects against PostgreSQL
data (which is cached in a PHP array to reduce the DB operations; no more
than 8 array elements, integer keys, integer data), and inserts into
PostgreSQL. The algorithm seems to be linear in nature and performance
deterioration most probably doesn't have to do with the PHP code.

Has anyone had an issue like this, and perhaps a suggestion for a possible
cause and solution? Is it common for PostgreSQL to grow so slow as the
amount of data in the tables increases? If so, is it just the insert
operation or all kinds of queries? Isn't 30 records too low a threshold
for such performance deterioration?

Here are some technical details, that might be helpful:

* PHP, MySQL and PostgreSQL all work on the same server, sharing the same
memory and hard drive.
* the server runs FreeBSD 5.3-RELEASE on an AMD Athlon(tm) 64 Processor
3000+ (2GHz K8 class CPU) with 1GB RAM
* the software versions installed are Apache 1.3.37, PHP 5.2.1, MySQL
4.1.22, PostgreSQL 8.1.8
* postgresql.conf variables other than defaults are: max_connections = 40,
shared_buffers = 1000 (this is the default)
* we have also tried these on another server with Red Hat Enterprise Linux
ES release 4 (Linux 2.6.9-42.0.3.ELsmp) on 2xDual Core AMD Opteron(tm)
Processor 270 (4x2GHz logical CPUs) with 2GB RAM
* both servers run in x86_64 mode, PostgreSQL footprint in memory stays
relatively small, CPU usage maxes out on import, there is no resource
starvation in any way

DDL statement for the creation of the PostgreSQL table in question:

CREATE TABLE "public"."sp_thread_replies" (
  "id" SERIAL,
  "thread_id" INTEGER NOT NULL,
  "body" TEXT NOT NULL,
  "ts_added" INTEGER DEFAULT 0 NOT NULL,
  "user_id" INTEGER NOT NULL,
  "thread_offset" INTEGER DEFAULT 0,
  "approved" SMALLINT DEFAULT 1,
  "title" TEXT,
  "deleted" SMALLINT DEFAULT 0,
  "edit_reason" VARCHAR(255),
  "edit_user_id" INTEGER,
  "edit_time" INTEGER,
  CONSTRAINT "sp_thread_replies_pkey" PRIMARY KEY("id"),
  CONSTRAINT "sp_thread_replies_threads_fk" FOREIGN KEY ("thread_id")
REFERENCES "public"."sp_threads"("id")
ON DELETE CASCADE
ON UPDATE NO ACTION
NOT DEFERRABLE,
  CONSTRAINT "sp_thread_replies_users_fk" FOREIGN KEY ("user_id")
REFERENCES "public"."sp_users"("id")
ON DELETE NO ACTION
ON UPDATE NO ACTION
NOT DEFERRABLE
) WITH OIDS;

The table is a part of a custom forum engine. It stores all thread posts.
It's most often queried with SELECTs and INSERTs, less often with UPDATEs,
and records are deleted quite seldom in normal operation of the application
(though we may delete records manually from the console from time to time). 





Re: [GENERAL] "Resurrected" data files - problem?

2007-11-08 Thread Albe Laurenz
Tom Lane wrote:
>>> So if we perform our database backups with incremental
>>> backups as described above, we could end up with additional
>>> files after the restore, because PostgreSQL files can get
>>> deleted (e.g. during DROP TABLE or TRUNCATE TABLE).
>>> 
>>> Could such "resurrected" files (data files, files in
>>> pg_xlog, pg_clog or elsewhere) cause a problem for the database
>>> (other than the obvious one that there may be unnecessary files
>>> about that consume disk space)?
>> 
>> This will not work at all.
> 
> To be more specific: the resurrected files aren't the problem;
> offhand I see no reason they'd create any issue beyond wasted
> disk space.  The problem is version skew between files that were
> backed up at slightly different times, leading to inconsistency.

I should have mentioned that before the (incremental) backup
there would be a pg_start_backup() and a pg_stop_backup()
afterwards, and we would use PITR.

So there could only be three kinds of files:
- Files that did not change since the full backup, restored
  from there. They should therefore look exactly as if the
  online backup were performed in the normal way.
- Files that have changed or are new, restored from the
  incremental backup. These will also be ok, because
  they were backed up between pg_start_backup() and
  pg_stop_backup().
- Files that have been deleted between full and incremental
  backup and have been resurrected.

This third group is the only one which might be problematic,
as far as I can see, because PostgreSQL will not expect them to
be there.

The version skew between files backed up at slightly different
times should be taken care of by PITR, shouldn't it?

Yours,
Laurenz Albe



Re: [GENERAL] System V IPC on Windows

2007-11-08 Thread Kevin Neufeld



 > > Does anyone know how to adjust the IPC settings in Windows?


There aren't any such settings in Windows, AFAIK.
  

Correct. The only real adjustable limit is the size of the Windows
pagefile, but that one is normally dynamic. 

[Lee Keel] 


What if the page file exceeds the shared_buffers, but you can't increase the
shared buffers to a larger amount?  For example, page file is set to be
between 500MB and 10GB, but you can't set the shared_buffers to more than
1200MB.  If set to 1300MB or higher then service will not start.

Please see "GEOS union() error" thread for more details.

Thanks,
Lee
  
(In the postgis-users list 
http://postgis.refractions.net/pipermail/postgis-users/2007-November/017616.html)


-- Kevin



Re: [GENERAL] what is the date format in binary query results

2007-11-08 Thread Tom Lane
Reg Me Please <[EMAIL PROTECTED]> writes:
> Il Thursday 08 November 2007 16:18:58 Tom Lane ha scritto:
>> It's either an int8 representing microseconds away from 2000-01-01
>> 00:00:00 UTC, or a float8 representing seconds away from the same
>> origin.

> Does this mean that negative numbers are for timestamps before y2k?

Right.

> Why and when there is a choice between int8 and float8 representation?

configure --enable-integer-datetimes.

regards, tom lane



Re: [GENERAL] "Resurrected" data files - problem?

2007-11-08 Thread Alan Hodgson
On Thursday 08 November 2007, "Albe Laurenz" <[EMAIL PROTECTED]> 
wrote:
> Can you give me a good reason why?
>
> > Try re-reading the instructions on backup in the manual.
>
> I know them well. That is why I ask if this questionable procedure
> could lead to damage.

You cannot back up a live database with a filesystem backup and expect it to 
work afterwards, unless you take all the steps required to produce a PITR 
base backup.

It's not even "questionable". It's a fundamental misunderstanding of what is 
required to back up a database.

-- 
Peak Oil is now accepted as inevitable, and the debate only becomes as 
to when - James Schlesinger, former US Secretary of Energy 




Re: [GENERAL] System V IPC on Windows

2007-11-08 Thread Kevin Neufeld
That makes sense, thanx. 

Another individual was having problems adjusting the shared_memory 
settings higher than 1.2GB on an 8GB 64-bit machine running Vista.  
Whenever he would adjust it higher than that, the postgresql service 
wouldn't start, throwing some kind of error.  In Linux, one would simply 
adjust the SHMMAX setting.  If there's no such setting in Windows, then 
the problem must lie somewhere else.
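
On Linux that adjustment would look roughly like this (the value is an
example only; add it to /etc/sysctl.conf to make it persistent):

    sysctl -w kernel.shmmax=1610612736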


As an aside, the docs could be a little clearer on this.  
(http://www.postgresql.org/docs/8.2/static/kernel-resources.html)
The only reference to Windows says "(For the Windows port, PostgreSQL 
provides its own replacement implementation of these facilities)".


Cheers,
Kevin

Magnus Hagander wrote:

 > > Does anyone know how to adjust the IPC settings in Windows?
  
There aren't any such settings in Windows, AFAIK. 



Correct. The only real adjustable limit is the size of the Windows pagefile, 
but that one is normally dynamic. But there must be room for all the shared 
memory in it. It's not going to be there, but the space is reserved.

That said, if you need to increase the pagefile size to accomodate your shared 
buffers, you likely have way too large value for shared buffers.

/Magnus
  




Re: [GENERAL] "Resurrected" data files - problem?

2007-11-08 Thread Scott Marlowe
On Nov 8, 2007 9:39 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Peter Childs" <[EMAIL PROTECTED]> writes:
> > On 08/11/2007, Albe Laurenz <[EMAIL PROTECTED]> wrote:
> >> So if we perform our database backups with incremental
> >> backups as described above, we could end up with additional
> >> files after the restore, because PostgreSQL files can get
> >> deleted (e.g. during DROP TABLE or TRUNCATE TABLE).
> >>
> >> Could such "resurrected" files (data files, files in
> >> pg_xlog, pg_clog or elsewhere) cause a problem for the database
> >> (other than the obvious one that there may be unnecessary files
> >> about that consume disk space)?
>
> > This will not work at all.
>
> To be more specific: the resurrected files aren't the problem;
> offhand I see no reason they'd create any issue beyond wasted
> disk space.  The problem is version skew between files that were
> backed up at slightly different times, leading to inconsistency.
>
> You could make this work if you shut down Postgres whenever you
> are taking a backup, but as a means for backing up a live database
> it indeed won't work at all.

I think if you had real snapshotting file systems you could use the
snapshots to create your backups.  But this seems like a lot of work
to avoid implementing PITR to me.



Re: [GENERAL] what is the date format in binary query results

2007-11-08 Thread Reg Me Please
Il Thursday 08 November 2007 17:09:22 Tom Lane ha scritto:
> Reg Me Please <[EMAIL PROTECTED]> writes:
> > Il Thursday 08 November 2007 16:18:58 Tom Lane ha scritto:
> >> It's either an int8 representing microseconds away from 2000-01-01
> >> 00:00:00 UTC, or a float8 representing seconds away from the same
> >> origin.
> >
> > Does this mean that negative numbers are for timestamps before y2k?
>
> Right.
>
> > Why and when there is a choice between int8 and float8 representation?
>
> configure --enable-integer-datetimes.

Wow: it's at compile time!

How can I tell which one has been chosen by my distribution (Ubuntu)?

-- 
Reg me Please



Re: [GENERAL] Strange variable behaviour when using it in limit clause in plpgsql stored procedure

2007-11-08 Thread Tom Lane
"Sergey Moroz" <[EMAIL PROTECTED]> writes:
> I tested performance of my query with limit clause inside plpgsql procedure.
> 2 slightly different situations:

> 1. SQL with limit clause and literal variable (for example 'select field1
> from table1 limit 100')
> 2. The same SQL with limit clause and plpgsql variable (for example 'select
> field1 from table1 limit vilimit'). vilimit is defined in the declare section.

> At first I compared execution plans. They were absolutely equal!
> But in fact the first procedure was 10 times faster than the second!

Exactly what did you do to conclude that the execution plans were the
same?  I would not expect the planner to choose the same plan in these
two cases, at least not if 100 is just a small fraction of the total
estimated query output.
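
A sketch of one way to see the difference from psql (table and column
names are the placeholders from the original post):

    EXPLAIN SELECT field1 FROM table1 LIMIT 100;

    PREPARE p(bigint) AS SELECT field1 FROM table1 LIMIT $1;
    EXPLAIN EXECUTE p(100);  -- planned without knowing the limit value,
                             -- much like the plpgsql variable case

If the two plans differ, that difference is what the variable costs you
inside the function as well.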

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] System V IPC on Windows

2007-11-08 Thread Lee Keel
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:pgsql-general-
> [EMAIL PROTECTED] On Behalf Of Magnus Hagander
> Sent: Thursday, November 08, 2007 12:47 AM
> To: Tom Lane
> Cc: Kevin Neufeld; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] System V IPC on Windows
> 
>  > > Does anyone know how to adjust the IPC settings in Windows?
> >
> > There aren't any such settings in Windows, AFAIK.
> 
> Correct. The only real adjustable limit is the size of the Windows
> pagefile, but that one is normally dynamic. But there must be room for all
> the shared memory in it. It's not going to be there, but the space is
> reserved.
> 
> That said, if you need to increase the pagefile size to accommodate your
> shared buffers, you likely have way too large a value for shared buffers.
> 
> /Magnus
> 
> 
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings
[Lee Keel] 

What if the page file is large enough for shared_buffers, but you still can't
increase shared_buffers?  For example, the page file is set to grow between
500MB and 10GB, but you can't set shared_buffers to more than 1200MB.  If it
is set to 1300MB or higher, the service will not start.

Please see "GEOS union() error" thread for more details.

Thanks,
Lee

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] why there is no interval / interval operator?

2007-11-08 Thread hubert depesz lubaczewski
On Thu, Nov 08, 2007 at 10:50:39AM -0500, Tom Lane wrote:
> hubert depesz lubaczewski <[EMAIL PROTECTED]> writes:
> > we have interval / float8 ( = interval), so i think that adding interval
> > / interval ( = float8) should be possible.
> What would you define it to mean, keeping in mind that an interval
> has three components not one?

ah. so i did overlook something :)

what 3 components, and what are the edge cases that it has to handle?
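
In case it helps, a sketch of the three pieces (months, days, and
seconds/microseconds are tracked separately, if I remember the storage
correctly):

    SELECT interval '1 year 2 months 3 days 04:05:06';
    -- kept as 14 months + 3 days + 04:05:06, so something like
    -- interval '1 month' / interval '1 day' has no single right answer:
    -- 28, 29, 30 or 31 depending on which month you mean
    SELECT extract(epoch FROM interval '1 month');  -- assumes a 30-day month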

depesz

-- 
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] what is the date format in binary query results

2007-11-08 Thread Reg Me Please
On Thursday 08 November 2007 16:18:58 Tom Lane wrote:
> It's either an int8 representing microseconds away from 2000-01-01
> 00:00:00 UTC, or a float8 representing seconds away from the same
> origin.

Does this mean that negative numbers are for timestamps before y2k?

Why and when is there a choice between int8 and float8 representation?


-- 
Reg me Please

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Calculation for Max_FSM_pages : Any rules of thumb?

2007-11-08 Thread Vivek Khera


On Nov 1, 2007, at 8:51 PM, Ow Mun Heng wrote:


> Another question is, based on what I've read in the archives (in my
> laptop.. No-Inet conn @ work): since I've overrun my max_FSM, I'm
> basically screwed and will have to do a vacuum verbose FULL on the
> entire DB. Crap..


I've seen this repeated many times as well, and I can't think of a
really good reason why this should be true.  Once you increase max_fsm_pages,
won't the very next regular vacuum find all the free space in pages and add
them to the map anyway?  I.e., you've not "lost" any free space once the next
regular vacuum runs.  At worst, you've got a slightly bloated table because
you allocated more pages rather than re-using some, but is that worth a full
vacuum?  I don't think it will be unless you're *way* under the fsm pages
needed and have been for a long time.
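
A sketch of what I would check before resorting to VACUUM FULL (assuming
a version where a database-wide VACUUM VERBOSE prints a free-space-map
summary at the end; wording of that summary is from memory):

    SHOW max_fsm_pages;
    VACUUM VERBOSE;
    -- compare the "page slots are required" figure in the final INFO
    -- lines against max_fsm_pages before deciding on a full vacuum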



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] why there is no interval / interval operator?

2007-11-08 Thread Tom Lane
hubert depesz lubaczewski <[EMAIL PROTECTED]> writes:
> we have interval / float8 ( = interval), so i think that adding interval
> / interval ( = float8) should be possible.

What would you define it to mean, keeping in mind that an interval
has three components not one?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] "Resurrected" data files - problem?

2007-11-08 Thread Albe Laurenz
Peter Childs wrote:
>> We use a tape backup software that does "incremental backups"
[...]
>> So if we perform our database backups with incremental
>> backups as described above, we could end up with additional
>> files after the restore, because PostgreSQL files can get
>> deleted (e.g. during DROP TABLE or TRUNCATE TABLE). 
>> 
>> My question is:
>>  
>> Could such "resurrected" files (data files, files in
>> pg_xlog, pg_clog or elsewhere) cause a problem for the database
>> (other than the obvious one that there may be unnecessary files 
>> about that consume disk space)?
> 
> This will not work at all.

Can you give me a good reason why?

> Try re-reading the instructions on backup in the manual.

I know them well. That is why I ask if this questionable procedure
could lead to damage.

Yours,
Laurenz Albe

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] "Resurrected" data files - problem?

2007-11-08 Thread Tom Lane
"Peter Childs" <[EMAIL PROTECTED]> writes:
> On 08/11/2007, Albe Laurenz <[EMAIL PROTECTED]> wrote:
>> So if we perform our database backups with incremental
>> backups as described above, we could end up with additional
>> files after the restore, because PostgreSQL files can get
>> deleted (e.g. during DROP TABLE or TRUNCATE TABLE).
>> 
>> Could such "resurrected" files (data files, files in
>> pg_xlog, pg_clog or elsewhere) cause a problem for the database
>> (other than the obvious one that there may be unnecessary files
>> about that consume disk space)?

> This will not work at all.

To be more specific: the resurrected files aren't the problem;
offhand I see no reason they'd create any issue beyond wasted
disk space.  The problem is version skew between files that were
backed up at slightly different times, leading to inconsistency.

You could make this work if you shut down Postgres whenever you
are taking a backup, but as a means for backing up a live database
it indeed won't work at all.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] what is the date format in binary query results

2007-11-08 Thread Tom Lane
Samantha Atkins <[EMAIL PROTECTED]> writes:
> What can I expect for a date format from a PGresult containing binary  
> results?  Specifically the Oid type is TIMESTAMPTZOID.

It's either an int8 representing microseconds away from 2000-01-01
00:00:00 UTC, or a float8 representing seconds away from the same
origin.
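
As a worked example of the arithmetic (integer-datetimes build assumed,
arbitrary literal; the value arrives big-endian on the wire):

    SELECT (extract(epoch FROM timestamptz '2007-11-08 12:00:00+00')
          - extract(epoch FROM timestamptz '2000-01-01 00:00:00+00')) * 1000000;
    -- the int8 you should find in the binary TIMESTAMPTZ field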

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


[GENERAL] why there is no interval / interval operator?

2007-11-08 Thread hubert depesz lubaczewski
is it just a simple omission, or am i missing something?

we have interval / float8 ( = interval), so i think that adding interval
/ interval ( = float8) should be possible.

depesz

-- 
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] "Resurrected" data files - problem?

2007-11-08 Thread Peter Childs
On 08/11/2007, Albe Laurenz <[EMAIL PROTECTED]> wrote:
>
> We use a tape backup software that does "incremental backups"
> as follows:
>
> - In a full backup, all files are backed up.
> - In an incremental backup, only the files with modification
>   date after the last backup are backed up.
>
> Now when such a backup is restored, you first have to restore
> the full backup, and then the incremental backup.
>
> The problem is that files which were deleted between the full
> and the incremental backup will get "resurrected" after such a
> restore.
>
> So if we perform our database backups with incremental
> backups as described above, we could end up with additional
> files after the restore, because PostgreSQL files can get
> deleted (e.g. during DROP TABLE or TRUNCATE TABLE).
>
> My question is:
>
> Could such "resurrected" files (data files, files in
> pg_xlog, pg_clog or elsewhere) cause a problem for the database
> (other than the obvious one that there may be unnecessary files
> about that consume disk space)?
>
>
This will not work at all.

Try re-reading the instructions on backup in the manual.

oh and always, always, always test your backup works before you actually
need it!

Peter Childs


[GENERAL] Strange variable behaviour when using it in limit clause in plpgsql stored procedure

2007-11-08 Thread Sergey Moroz
I tested performance of my query with limit clause inside plpgsql procedure.
2 slightly different situations:

1. SQL with limit clause and literal variable (for example 'select field1
from table1 limit 100')
2. The same SQL with limit clause and plpgsql variable (for example 'select
field1 from table1 limit vilimit'). vilimit is defined in the declare section.

At first I compared execution plans. They were absolutely equal!
But in fact the first procedure was 10 times faster than the second! What's
the problem?!?!
Note: the tested SQL was complex enough; I didn't test this case on a simple
query like 'select field1 from table1 limit 100'.
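
A stripped-down sketch of the second variant, in case someone wants to
reproduce it (table, column and function names are invented):

    CREATE OR REPLACE FUNCTION limit_from_var() RETURNS SETOF text AS $$
    DECLARE
        vilimit integer := 100;   -- the limit arrives in a variable, not a literal
        r text;
    BEGIN
        FOR r IN SELECT field1 FROM table1 LIMIT vilimit LOOP
            RETURN NEXT r;
        END LOOP;
        RETURN;
    END;
    $$ LANGUAGE plpgsql;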

-- 
Sincerely,
Sergey Moroz


[GENERAL] "Resurrected" data files - problem?

2007-11-08 Thread Albe Laurenz
We use a tape backup software that does "incremental backups"
as follows:

- In a full backup, all files are backed up.
- In an incremental backup, only the files with modification
  date after the last backup are backed up.

Now when such a backup is restored, you first have to restore
the full backup, and then the incremental backup.

The problem is that files which were deleted between the full
and the incremental backup will get "resurrected" after such a
restore.

So if we perform our database backups with incremental
backups as described above, we could end up with additional
files after the restore, because PostgreSQL files can get
deleted (e.g. during DROP TABLE or TRUNCATE TABLE).

My question is:

Could such "resurrected" files (data files, files in
pg_xlog, pg_clog or elsewhere) cause a problem for the database
(other than the obvious one that there may be unnecessary files
about that consume disk space)?

Yours,
Laurenz Albe

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] any way for ORDER BY x to imply NULLS FIRST in 8.3?

2007-11-08 Thread Jorge Godoy
On Wednesday 07 November 2007 13:54:32 rihad wrote:
>
> May I, as an outsider, comment? :) I really think of ASC NULLS FIRST
> (and DESC NULLS LAST) as the way to go. Imagine a last_login column that
> sorts users that have not logged in as the most recently logged in,
> which is not very intuitive. I vote for sort_nulls_first defaulting to
> false in order not to break bc.

But then, when ordering by login date, you should use COALESCE and infinity 
for them 
(http://www.postgresql.org/docs/8.2/interactive/datatype-datetime.html).
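
A sketch of that, assuming a users table with a nullable last_login
timestamptz column (whether you want 'infinity' or '-infinity' depends on
the sort direction):

    SELECT * FROM users
    ORDER BY COALESCE(last_login, '-infinity') DESC;
    -- users who never logged in sink to the bottom instead of sorting
    -- as the most recently logged in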



-- 
Jorge Godoy  <[EMAIL PROTECTED]>


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq