Re: [HACKERS] [COMMITTERS] pgsql: Stamp major release 8.3.0,

2007-01-11 Thread Joachim Wieland
On Thu, Jan 11, 2007 at 08:41:24AM +0100, Michael Meskes wrote:
  While I'm whining ... we really need some support in the ecpg regression
  tests for platform-specific diffs, so that the consistent ECPG-check
  failures in buildfarm can go away.

 Hmm, I thought there was. Joachim, could you comment?

There are; see for example
ecpg/test/expected/compat_informix-dec_test-MinGW32.stdout

AFAIK there are no platforms other than MinGW that need special
treatment.

On the buildfarm we have ecpg failures right now on:

 - osprey
 - guppy
 - clownfish

osprey just seems to be out of disk space.

On guppy the ecpg checks trigger the OpenBSD bug that Michael and Stefan
identified here:
http://archives.postgresql.org/pgsql-hackers/2006-09/msg00593.php
Not sure what to do about it; we could diff it away to get the buildfarm
green, but that would not solve the problem. What do you think? I will ask
the maintainer of the box if he intends to update the operating system. If
not, I'd propose to diff it away for the time being.

Clownfish is the Solaris box where Stefan reported segmentation faults only
some days ago. We need to look into this one.



Joachim





Re: [HACKERS] -f output file option for pg_dumpall

2007-01-11 Thread Dave Page
Possibly, to merge the two programs. I'm intending to put some time into
the append and separating-globals items, but I don't think I have the
time to merge the apps given Tom's concerns and some further investigation.

Regards, Dave.

Bruce Momjian wrote:
 Is there a TODO here?
 
 ---
 
 Dave Page wrote:
 Dave Page wrote:
 I don't object to it in principle, but I think a bit more thought is
 needed as to what's the goal.  A stupid append option would be enough
 for pg_dumpall's current capabilities (ie, text output only) --- but is
 it reasonable to consider generalizing -Fc and -Ft modes to deal with
 multiple databases, and if so how would that need to change pg_dump's
 API?  (I'm not at all sure this is feasible, but let's think about it
 before plastering warts onto pg_dump, not after.)
 Hmm, OK. I'll need to have a good look at the code before I can even
 think about commenting on that, which will have to wait until after I've
 finished bundling releases.
 And having done so, I agree that it's not really feasible without
 significant effort to allow each archive format to be closed and
 re-opened between multiple instances of pg_dump and pg_dumpall, as well
 as to allow them to support multiple databases and global objects
 (though they can effectively live in the default DB of course) within a
 single archive. I'm fairly certain it would be easier to merge the two
 programs as originally suggested, though that does indeed look trickier
 (and more dangerous) than I originally envisaged.

 How about adding the append option, but leaving it undocumented? That
 way, if anyone gets the itch to do a full rewrite in the future, we
 haven't necessarily got to continue to support an option we no longer want.

 Regards, Dave.


 




Re: [HACKERS] [COMMITTERS] pgsql: Stamp major release 8.3.0,

2007-01-11 Thread Michael Meskes
On Thu, Jan 11, 2007 at 09:51:11AM +0100, Joachim Wieland wrote:
 There are; see for example
 ecpg/test/expected/compat_informix-dec_test-MinGW32.stdout
 
 AFAIK there are no platforms other than MinGW that need special
 treatment.

Talking about MinGW, do all MinGW systems return:

Connection refused (0x274D/10061)

if the connection is refused, or do the numbers differ?

Or in other words, do we need the sed call in pg_regress.sh or could we
do this with an arch-specific expected file too?

Michael

-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!




Re: [HACKERS] share info between backends

2007-01-11 Thread Martijn van Oosterhout
On Thu, Jan 11, 2007 at 12:58:01AM -0500, Jaime Casanova wrote:
 Hi,
 
 i'm trying to share some info between backends but i cannot figure how
 to do it...
 i was reading bgwriter.c and autovacuum.c hoping there is something
 that can show me the way to go...

Well, there's a shared memory block, but getting access to it is not
easy. Using a table may be easier, but you'll need to provide more
details if you want more help.
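
For illustration, here's a minimal sketch of the usual pattern (hypothetical
names; ShmemInitStruct() is the real API, but segment sizing is left out and
a real version would guard the struct with a lock):

#include "postgres.h"
#include "storage/shmem.h"

typedef struct MySharedState
{
    int         shared_counter;     /* example payload */
} MySharedState;

static MySharedState *my_state = NULL;

/* Run during shared-memory initialization. */
void
MySharedStateInit(void)
{
    bool        found;

    /* Attach to (or create) a named chunk of the shared segment. */
    my_state = (MySharedState *)
        ShmemInitStruct("my shared state", sizeof(MySharedState), &found);
    if (!found)
        my_state->shared_counter = 0;   /* first backend initializes it */
}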

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] SPAR Simple PostgreSQL AddOn Replication System

2007-01-11 Thread Gurjeet Singh

On 12/21/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:


 If you would like to give my humble creation a try...

http://spar.orgfree.com/index.html



Hi Johnny,

   I must say, I was really fascinated by the idea. This is almost
multi-master replication, although with a lot of grey areas. I had read the
docs a few days ago but did not test it. Yesterday, I gave it a try, but I am
sorry to say it is not working. Here's what I did to test it:

I am on Windows, using MinGW to compile and install postgres.
I compiled the sources for Postgres REL8_2_STABLE.
Installed Tomcat (apache-tomcat-6.0.7.exe) (and it is using java from the
following location on my machine: C:\Program
Files\Java\jre1.5.0_10\bin\client\jvm.dll).
I placed the spar.war file in the tomcat's webapp directory.

Now, from the command line I fired the following to set up (identical)
databases:

# create DATA directories
initdb -D /d/data1
initdb -D /d/data2
initdb -D /d/data3

# startup these database clusters on different ports
pg_ctl start -D /d/data1/ -o "-p 6543" -l /d/data1/server.log
pg_ctl start -D /d/data2/ -o "-p 5432" -l /d/data2/server.log
pg_ctl start -D /d/data3/ -o "-p 4321" -l /d/data3/server.log

# create 'test' database on each of these clusters
createdb -p 6543 test
createdb -p 5432 test
createdb -p 4321 test

# create 'test' tables in each of the cluster's 'test' database
psql.exe -p 6543 test -c "create table test( a int primary key, b int )"
psql.exe -p 5432 test -c "create table test( a int primary key, b int )"
psql.exe -p 4321 test -c "create table test( a int primary key, b int )"

# insert a sample row in each of them
psql.exe -p 6543 test -c "insert into test values( 1, 1 )"
psql.exe -p 5432 test -c "insert into test values( 1, 1 )"
psql.exe -p 4321 test -c "insert into test values( 1, 1 )"

# verify that we have identical data on each database
psql.exe -p 6543 test -c "select * from test"
psql.exe -p 5432 test -c "select * from test"
psql.exe -p 4321 test -c "select * from test"

Now, I configured the SPAR as follows:
Pointed the browser (firefox) to http://localhost:8080/spar/start
On the 'Configuration Set' page I made following 3 entries:

jdbc:postgresql://localhost:6543/test
jdbc:postgresql://localhost:5432/test
jdbc:postgresql://localhost:4321/test

Against each of these entries, I entered my OS user name, left the passwd
field blank, and checked the 'M' check-box (monitored).

I clicked on the 'Configure dBs' button.
I clicked on the 'Test dB connections' button. (The three entries were
highlighted in green and the rest of the rows in red.)
I clicked on the 'Add scripts to DB' button. (8 tables (with names starting
with _) and some procedures were created on each of the databases.)
Then, on the 'Server Control' page, I clicked on the 'Start' button.

I guess this is all I need to do to set up the replication. This is basically
a setup similar to the one described under your 'single replication server'
archetype. Please point out if I missed something.

Now that the replication was set up, I wanted to test it by inserting some
data into one DB and monitoring the others to see if it showed up there. So I
executed the following:

psql.exe -p 6543 test -c "insert into test values( 2, 2 )"

And even after waiting for some time, nothing happened on the other
databases! I checked for errors on the 'Server Control' page (I did press the
Refresh button). I also ran the following to see if any errors had occurred:

psql.exe -p 6543 test -c "select * from _errors"
psql.exe -p 5432 test -c "select * from _errors"
psql.exe -p 4321 test -c "select * from _errors"

but nothing!!! I tried tables with just one integer column and no primary
key; even that didn't work. Can you please see if I made some
mistake?

Also, the sources are not available. Although you mention that the software
is free, you should consider associating some license with it; just stick a
license file in your package. I am no expert here, but I guess BSD would be
appropriate for users to be able to use it without any complications,
since Postgres is also under BSD.

Regards,

--
[EMAIL PROTECTED]
[EMAIL PROTECTED]{ gmail | hotmail | yahoo }.com


Re: [HACKERS] [COMMITTERS] pgsql: Stamp major release 8.3.0,

2007-01-11 Thread Magnus Hagander
On Thu, Jan 11, 2007 at 10:49:59AM +0100, Michael Meskes wrote:
 On Thu, Jan 11, 2007 at 09:51:11AM +0100, Joachim Wieland wrote:
  There are; see for example
  ecpg/test/expected/compat_informix-dec_test-MinGW32.stdout
  
  AFAIK there are no platforms other than MinGW that need special
  treatment.
 
 Talking about MinGW, do all MinGW systems return:
 
 Connection refused (0x274D/10061)
 
 if the connection is refused, or do the numbers differ?

It should be the same - 10061 is the win32 error code. 274D is just the
hex version of the same one.

 Or in other words, do we need the sed call in pg_regress.sh or could we
 do this with an arch-specific expected file too?

Can't comment on that one, since I just noticed it existed. How similar
is this one to the standard regression tests? Those were moved into a
C executable so they'd run on a Windows system without a shell; could
the same be done relatively easily with this one?

(Obviously we can't run the ecpg regression tests on msvc builds now -
oops, didn't know those had their own script)

//Magnus



Re: [HACKERS] share info between backends

2007-01-11 Thread Jaime Casanova

On 1/11/07, Martijn van Oosterhout kleptog@svana.org wrote:

On Thu, Jan 11, 2007 at 12:58:01AM -0500, Jaime Casanova wrote:
 Hi,

 i'm trying to share some info between backends but i cannot figure how
 to do it...
 i was reading bgwriter.c and autovacuum.c hoping there is something
 that can show me the way to go...

Well, there's a shared memory block, but getting access to it is not
easy. Using a table may be easier, but you'll need to provide more
details if you want more help.



i'm trying to fix a problem related to the patch Albert sent in
october (Tablespace for temporary objects and sort files)
http://archives.postgresql.org/pgsql-patches/2006-10/msg00141.php
http://archives.postgresql.org/pgsql-patches/2007-01/msg00063.php

after reviewing this i noticed that it will not use different tablespaces
in the case of having various connections, each with just one temp
table:
http://archives.postgresql.org/pgsql-patches/2007-01/msg00188.php

what i'm trying to do is share the next_temp_tablespace to use...
the other way to do this is, as you say, using tables, but i don't think
that's a good idea in this case...

comments?

--
regards,
Jaime Casanova

Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning.
  Richard Cook



Re: [HACKERS] share info between backends

2007-01-11 Thread Andrew Dunstan
Jaime Casanova wrote:

 i'm trying to fix a problem related to the patch Albert sent in
 october (Tablespace for temporary objects and sort files)
 http://archives.postgresql.org/pgsql-patches/2006-10/msg00141.php
 http://archives.postgresql.org/pgsql-patches/2007-01/msg00063.php

 after reviewing this i noticed that it will not use different tablespaces
 in the case of having various connections, each with just one temp
 table:
 http://archives.postgresql.org/pgsql-patches/2007-01/msg00188.php

 what i'm trying to do is share the next_temp_tablespace to
 use...
 the other way to do this is, as you say, using tables, but i don't think
 that's a good idea in this case...

 comments?


Why not make it probabilistic by using, say, MyProcPid % n, where n is the
number of tablespaces? Then you don't need anything shared.
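
A hypothetical sketch of that idea, assuming a caller-supplied array of
candidate tablespace OIDs (MyProcPid is the backend's PID, from miscadmin.h):

#include "postgres.h"
#include "miscadmin.h"

/* Sketch only: pick a temp tablespace with no shared state at all. */
static Oid
choose_temp_tablespace(const Oid *tablespaces, int n)
{
    /* Different backends have different PIDs, so they tend to spread
     * across the n tablespaces without any coordination. */
    return tablespaces[MyProcPid % n];
}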

cheers

andrew




[HACKERS] contrib/xml2 and xml type

2007-01-11 Thread Peter Eisentraut
Assuming a working xml type, what do you think the future of the 
contrib/xml2 module should be?

At the moment, I'd imagine that we add duplicate versions of most 
functions, where appropriate, that use the xml type instead of the text 
type.  Perhaps we should supply two sets of SQL files, so that users 
have the choice of using the legacy versions or the type-safe 
versions.  Anything else?

(I understand that some people are researching GIN-optimized XPath 
access to XML data, but that is really a bit further out.)

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



[HACKERS] wal buffers documentation -errata

2007-01-11 Thread Dave Cramer

Currently says

Number of disk-page buffers allocated in shared memory for WAL data.  
The default is 8. The setting need only be large enough to hold the  
amount of WAL data generated by one typical transaction, since the  
data is written out to disk at every transaction commit. This  
parameter can only be set at server start.


However I just loaded up an 8.2.1 and the default is 32m

Dave



Re: [HACKERS] contrib/xml2 and xml type

2007-01-11 Thread Nikolay Samokhvalov

Duplicate versions of functions (e.g., there would be XMLPATH() as the main
XPath function for the xml type, producing arrays of values of xml type in
the general case -- non-standard, but generalized).

In addition to the two SQL files for registering the module's functions in a
database, I would move the XSLT functions to a separate SQL file (many people
do not need XSLT, just XPath, or vice versa). Also, maybe it's worth adjusting
the Makefile to make using contrib/xml2 without XSLT (on systems w/o libxslt)
a little bit simpler (right now everyone has to edit both the Makefile and the
...sql.in manually).

On 1/11/07, Peter Eisentraut [EMAIL PROTECTED] wrote:


Assuming a working xml type, what do you think the future of the
contrib/xml2 module should be?

At the moment, I'd imagine that we add duplicate versions of most
functions, where appropriate, that use the xml type instead of the text
type.  Perhaps we should supply two sets of SQL files, so that users
have the choice of using the legacy versions or the type-safe
versions.  Anything else?

(I understand that some people are researching GIN-optimized XPath
access to XML data, but that is really a bit further out.)

--
Peter Eisentraut
http://developer.postgresql.org/~petere/






--
Best regards,
Nikolay


Re: [HACKERS] wal buffers documentation -errata

2007-01-11 Thread Tom Lane
Dave Cramer [EMAIL PROTECTED] writes:
 However I just loaded up an 8.2.1 and the default is 32m

Then you changed it in postgresql.conf.  I get

$ psql
Welcome to psql 8.2.1, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

regression=# show wal_buffers ;
 wal_buffers
-------------
 64kB
(1 row)

regression=#

regards, tom lane



Re: [HACKERS] contrib/xml2 and xml type

2007-01-11 Thread Nikolay Samokhvalov

On 1/11/07, Nikolay Samokhvalov [EMAIL PROTECTED] wrote:


Duplicate versions of functions (e.g., there would be XMLPATH() as the
main XPath function for the xml type, producing arrays of values of xml type
in the general case -- non-standard, but generalized).



Sorry :-) I wanted to say that I suppose duplicate functions are a good idea.

--
Best regards,
Nikolay


Re: [HACKERS] [COMMITTERS] pgsql: Stamp major release 8.3.0,

2007-01-11 Thread Michael Meskes
On Sat, Jan 06, 2007 at 01:37:03PM +0100, Joachim Wieland wrote:
 Attached is a patch that adds a --regression option to ecpg. I replaced the
 manual checking for long options (--version and --help) by a call to
 ...

Applied. I also changed the regression handling in other places. Guys,
please test and don't be surprised if something breaks on the buildfarm.
So far it's only been tested on my system. It works here. :-)

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!



Re: [HACKERS] [COMMITTERS] pgsql: Stamp major release 8.3.0,

2007-01-11 Thread Tom Lane
Joachim Wieland [EMAIL PROTECTED] writes:
 On guppy the ecpg checks trigger the OpenBSD bug that Michael and Stefan
 identfied here:
 http://archives.postgresql.org/pgsql-hackers/2006-09/msg00593.php
 Not sure what to do about it, we could diff it away to get it green but it
 would not solve the problem.

It's not our problem to solve, so I'd vote for providing the alternate
test file.

regards, tom lane



Re: [HACKERS] ideas for auto-processing patches

2007-01-11 Thread markwkm

On 1/4/07, Andrew Dunstan [EMAIL PROTECTED] wrote:

Gavin Sherry wrote:
 On Thu, 4 Jan 2007 [EMAIL PROTECTED] wrote:

 1. Pull source directly from repositories (cvs, git, etc.)  PLM
 doesn't really track actual scm repositories.  It requires
 directories of source code to be traversed, which are set up by
 creating mirrors.

 It seems to me that a better approach might be to mirror the CVS repo --
 or at least make that an option -- and pull the sources locally. Having to
 pull down 100MB of data for every build might be onerous to some build
 farm members.



I am not clear about what is being proposed. Currently buildfarm syncs
against (or pulls a fresh copy from, depending on configuration) either
the main anoncvs repo or a mirror (which you can get using cvsup or rsync,
among other mechanisms). I can imagine a mechanism in which we pull
certain patches from a patch server (maybe using an RSS feed, or a SOAP
call?) which could be applied before the run. I wouldn't want to couple
things much more closely than that.


I'm thinking that a SOAP call might be easier to implement?  The RSS
feed seems like it would be more interesting as I am imagining that a
buildfarm system might be able to react to new patches being added to
the system.  But maybe that's a trivial thing for either SOAP or an
RSS feed.


The patches would need to be vetted first, or no sane buildfarm owner will
want to use them.


Perhaps as a first go it can pull any patch that can be applied
without errors? The list of patches to test can eventually be
restricted by name and by who submitted them.

Regards,
Mark



Re: [HACKERS] ideas for auto-processing patches

2007-01-11 Thread Andrew Dunstan

[EMAIL PROTECTED] wrote:


I am not clear about what is being proposed. Currently buildfarm syncs
against (or pulls a fresh copy from, depending on configuration) either
the main anoncvs repo or a mirror (which you can get using cvsup or 
rsync,

among other mechanisms). I can imagine a mechanism in which we pull
certain patches from a patch server (maybe using an RSS feed, or a SOAP
call?) which could be applied before the run. I wouldn't want to couple
things much more closely than that.


I'm thinking that a SOAP call might be easier to implement?  The RSS
feed seems like it would be more interesting as I am imagining that a
buildfarm system might be able to react to new patches being added to
the system.  But maybe that's a trivial thing for either SOAP or an
RSS feed.


I'd be quite happy with SOAP. We can make SOAP::Lite an optional load 
module, so if you don't want to run patches you don't need to have the 
module available.




The patches would need to be vetted first, or no sane buildfarm owner 
will

want to use them.


Perhaps as a first go it can pull any patch that can be applied
without errors?  The list of patches to test can eventually be
restricted by name and by who submitted them.




This reasoning seems unsafe. I am not prepared to test arbitrary patches 
on my machine - that seems like a perfect recipe for a trojan horse. I 
want to know that they have been vetted by someone I trust. That means 
that in order to get into the feed in the first place there has to be a 
group of trusted submitters. Obviously, current postgres core committers 
should be in that group, and I can think of maybe 5 or 6 other people 
that could easily be on it. Perhaps we should leave the selection to the 
core team.


cheers

andrew




Re: [HACKERS] -f output file option for pg_dumpall

2007-01-11 Thread Bruce Momjian
Dave Page wrote:
 Possibly, to merge the two programs. I'm intending to put some time into
 the append and separating-globals items, but I don't think I have the
 time to merge the apps given Tom's concerns and some further investigation.

Yes, I was just wondering if an append mode for Win32 would be an easy
solution that could be put on the TODO list.  I don't see a merge of the
tools as being a win myself.

---


 
 Regards, Dave.
 
 Bruce Momjian wrote:
  Is there a TODO here?
  
  ---
  
  Dave Page wrote:
  Dave Page wrote:
  I don't object to it in principle, but I think a bit more thought is
  needed as to what's the goal.  A stupid append option would be enough
  for pg_dumpall's current capabilities (ie, text output only) --- but is
  it reasonable to consider generalizing -Fc and -Ft modes to deal with
  multiple databases, and if so how would that need to change pg_dump's
  API?  (I'm not at all sure this is feasible, but let's think about it
  before plastering warts onto pg_dump, not after.)
  Hmm, OK. I'll need to have a good look at the code before I can even
  think about commenting on that, which will have to wait until after I've
  finished bundling releases.
  And having done so, I agree that it's not really feasible without
  significant effort to allow each archive format to be closed and
  re-opened between multiple instances of pg_dump and pg_dumpall, as well
  as to allow them to support multiple databases and global objects
  (though they can effectively live in the default DB of course) within a
  single archive. I'm fairly certain it would be easier to merge the two
  programs as originally suggested, though that does indeed look trickier
  (and more dangerous) than I originally envisaged.
 
  How about adding the append option, but leaving it undocumented? That
  way, if anyone gets the itch to do a full rewrite in the future, we
  haven't necessarily got to continue to support an option we no longer want.
 
  Regards, Dave.
 
 
 

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] wal buffers documentation -errata

2007-01-11 Thread Tom Lane
Dave Cramer [EMAIL PROTECTED] writes:
 the point is that the documentation suggests that the default is 8
 (not 8MB, but 8), when in fact the defaults are now given in memory
 units, not pages

Oh, I thought you were complaining that the value was numerically wrong.

Perhaps we should convert the documentation to show the defaults in a
units-ified way, but if so it needs to be done consistently.  Most of
the entries seem not to have been changed; for example shared_buffers
is still described in blocks.

regards, tom lane



Re: [HACKERS] Load distributed checkpoint

2007-01-11 Thread Inaam Rana


No, I've not tried yet.  Inaam-san told me that Linux had a few I/O
schedulers but I'm not familiar with them.  I'll find information
about them (how to change the scheduler settings) and try the same
test.



I am sorry, your response just slipped by me. The docs for RHEL (I believe
you are running RHEL, which has the 2.6.9 kernel) say that it does support a
selectable IO scheduler.

http://www.redhat.com/rhel/details/limits/

I am not sure where else to look for the scheduler apart from /sys.

regards,
inaam


Re: [HACKERS] pgindent infelicity

2007-01-11 Thread Bruce Momjian
Tom Lane wrote:
 I notice that the latest pgindent run has decided that comments attached
 to else should be moved onto the next line, as in this example in
 src/bin/psql/mbprint.c:
 
 {
 linewidth += 4;
 format_size += 4;
 }
 -   else  /* Output itself */
 +   else
 +   /* Output itself */
 {
 linewidth++;
 format_size += 1;
 }
 
 I find this pretty ugly; can it be fixed?

FYI, I fixed this soon after you reported it.  It will now only push
down multi-line comments after 'else'.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] wal buffers documentation -errata

2007-01-11 Thread Dave Cramer


On 11-Jan-07, at 12:49 PM, Tom Lane wrote:


Dave Cramer [EMAIL PROTECTED] writes:

the point is that the documentation suggests that the default is 8
(not 8MB, but 8), when in fact the defaults are now given in memory
units, not pages


Oh, I thought you were complaining that the value was numerically  
wrong.


Perhaps we should convert the documentation to show the defaults in a
units-ified way, but if so it needs to be done consistently.  Most of
the entries seem not to have been changed; for example shared_buffers
is still described in blocks.


Yes, everything is described in blocks, but in the configuration file
everything (that I've looked at so far) is specified in memory units.
While I appreciate the effort that went into making it somewhat
easier to use memory units, I can see this being very confusing for
the average user.


I would suggest that the documentation needs to be consistent with
the example configuration file installed by initdb.
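
For example, an 8.2-era excerpt (the wal_buffers value matches Tom's output
above; the shared_buffers figure is my recollection of what initdb writes, so
treat it as an assumption):

# postgresql.conf as installed by initdb shows memory units:
shared_buffers = 32MB          # min 128kB or max_connections*16kB
wal_buffers = 64kB             # min 32kB

The docs, meanwhile, still describe both settings in disk pages.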


Dave


regards, tom lane







Re: [HACKERS] -f output file option for pg_dumpall

2007-01-11 Thread Neil Conway
On Fri, 2007-01-05 at 17:52 -0500, Tom Lane wrote:
 I think this will be an exercise in time-wasting, and very possibly
 destabilize *both* tools.  pg_dump has never been designed to reconnect
 to a different database; for instance there isn't any code for resetting
 all the internal state that it gathers.

That is merely an implementation issue. The question of whether pg_dump
and pg_dumpall should be separate programs is a question of design,
IMHO.

I don't think they need to be integrated any time soon, but if we were
to design pg_dump and pg_dumpall from scratch, it seems more logical to
use a single program, and therefore that is the long-term direction I
think we should head in.

-Neil





Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-11 Thread Tom Lane
Patrick Earl [EMAIL PROTECTED] writes:
 In any case, the unit tests remove all contents and schema within the
 database before starting, and they remove the tables they create as
 they proceed.  Certainly there are many things that have been recently
 deleted.

Yeah, I think then there's no question that the bgwriter is trying to
fsync something that's been deleted but isn't yet closed by every
process.  We have things set up so that that's not a really serious
problem anymore --- eventually it will be closed and then the next
checkpoint will succeed.  But CREATE DATABASE insists on checkpointing
and so it's vulnerable to even a transient failure.

I've been resisting changing the checkpoint code to treat EACCES as a
non-error situation on Windows, but maybe we have no choice.  How do
people feel about this idea: #ifdef WIN32 and the open or fsync fails
with EACCES, then

1. Emit a LOG (or maybe DEBUG) message noting the problem.
2. Leave the fsync request entry in the hashtable for next time.
3. Allow the current checkpoint to complete normally anyway.

If the file has actually been deleted, then eventually it will be closed
and the next checkpoint will be able to remove the hash entry.  If
there's something else wrong, we'll keep bleating and maybe the DBA will
notice eventually.
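
As a sketch of the control flow (not actual PostgreSQL code; the helper name
is made up and only the #ifdef WIN32 / EACCES case is the point):

#include <errno.h>
#include <stdbool.h>

/*
 * Decide whether an open/fsync failure during checkpoint should be
 * treated as transient.  On Windows, EACCES is assumed to mean the file
 * was deleted but is still open in some backend: log it, keep the fsync
 * request queued, and let the checkpoint complete.
 */
static bool
fsync_failure_is_transient(int saved_errno)
{
#ifdef WIN32
    if (saved_errno == EACCES)
        return true;            /* retry at the next checkpoint */
#endif
    return false;               /* anything else is a hard error */
}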

The downside of this is that a real EACCES problem wouldn't get noted at
any level higher than LOG, and so you could theoretically lose data
without much warning.  But I'm not seeing anything else we could do
about it --- AFAIK we have not heard of a way we can distinguish this
case from a real permissions problem.  And anyway there should never
*be* a real permissions problem; if there is then the user's been poking
under the hood sufficient to void the warranty anyway ;-)

Comments?

regards, tom lane



[HACKERS] Some notes about redesigning planner data structures

2007-01-11 Thread Tom Lane
I've been looking at improving the planner so that it can handle things
like backwards-order mergejoins, and I'm starting to realize that the
old assumption that mergejoinable operators had only one associated sort
ordering is wired into even more places than I thought.  In particular,
the PathKeys data structure (see src/backend/optimizer/README if you
don't know about it) seems to need revisions.  As it stands we'd have to
generate a lot of redundant PathKeys.

What I'm toying with doing is splitting PathKeys into two layers of data
structure.  The lower layer would take over the function of representing
that we know a bunch of variables have been equated to each other, and
the upper layer would handle the task of representing a specific sort
order.  This would be implemented as two new node types:

EquivalenceClass: contains a btree opfamily OID and a list of
expressions.  This asserts that all the expressions have been equated by
mergejoinable operators in that opfamily, so we can transitively
conclude that they are all equal (according to that opfamily's notion of
equality).  We might wish to make the list include not just the raw
expressions but additional data (eg their relation memberships), to
ease searching the list.

PathKey: contains a pointer to an EquivalenceClass, a sort direction
(BTLessStrategyNumber or BTGreaterStrategyNumber), and a nulls-first flag.
This represents a single-column sort ordering.  We continue to represent
the total ordering of a Path as a list of PathKeys.
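
In rough C, the two node types might look like this (a sketch transcribed
from the description above; field names are illustrative, not from any
actual patch):

#include "postgres.h"
#include "nodes/pg_list.h"

typedef struct EquivalenceClass
{
    NodeTag     type;
    Oid         opfamily;       /* btree opfamily of the equality operators */
    List       *members;        /* expressions known equal under that
                                 * opfamily, possibly with extra data such
                                 * as relation membership */
} EquivalenceClass;

typedef struct PathKey
{
    NodeTag     type;
    EquivalenceClass *eclass;   /* the equated expressions */
    int         strategy;       /* BTLessStrategyNumber or
                                 * BTGreaterStrategyNumber */
    bool        nulls_first;    /* do NULLs sort before non-NULLs? */
} PathKey;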

A possible objection to this is that it ties the planner's handling of
sort ordering even more tightly to btree, but actually I think that's
not a big problem.  We could handle opfamilies belonging to multiple
orderable index types as long as they all use btree's strategy numbers.
In any case, with no new orderable index types on the horizon, I'm not
all that worried about this.

During any one planner run, we will keep all the EquivalenceClasses and
PathKeys that have been created in lists dangling from PlannerInfo, and
be careful not to make duplicate PathKey objects.  This allows us to
keep using simple pointer equality to compare PathKeys.  This means that
we have to finish merging EquivalenceClasses before we can make any
PathKeys, but I think that will be OK.

The RestrictInfo for a mergejoinable opclause will no longer store
PathKeys for its left and right sides, but rather EquivalenceClass
references.  (In the simple case the left and right EquivalenceClasses
would be the same class, but if we couldn't consider the opclause an
equijoin, they'd be different classes, possibly with only one member.
This is really the same thing that happens now with PathKeys.)

One issue is that the same operator could possibly be equality in more
than one opfamily.  We could generate separate EquivalenceClasses for
each such opfamily and store lists of pointers in the RestrictInfos, but
that seems a bit messy.  An alternative is to allow a list of opfamily
OIDs in an EquivalenceClass, but then there comes the question of what
to do if some equality operators contributing to the EC are members of
more opfamilies than others.  Can we legitimately conclude that any such
omissions are oversights and assume the entries should have been there?
If so, we could take the union of the observed opfamily lists as the
opfamily list for the EquivalenceClass.  If not, we could just not
combine EquivalenceClasses for operators that have different opfamily
membership sets, but that would lose some amount of useful knowledge.

An idea that seems really attractive if we do this is to get rid of the
explicit generate_implied_equalities step, in favor of dynamically
generating an appropriate join condition whenever a join between two
relations having elements in an EquivalenceClass is made.  The
RestrictInfos for the original clauses giving rise to the EC wouldn't
get put into join condition lists directly, only indirectly through this
process.  This'd eliminate the need for detecting and removing redundant
clauses as we currently do it, since only one of the possible join
conditions associated with a particular EquivalenceClass would be
generated and inserted into a join condition list.

One of the things I don't like about generate_implied_equalities is that
it has to fail if there's no cross-type equality operator for a
particular datatype combination.  Currently we tell people they'd better
make sure that mergejoinable operators come in complete cross-type sets,
but that's not real attractive.  This approach can improve the
situation: rather than failing if we can't generate *all* the equality
combinations implied by a particular equivalence set, we need only fail
if we can't generate *any* of the valid combinations for a particular
join.  What's more, failure need no longer mean elog(ERROR), it just
means we reject that particular join path as invalid.  (We can be sure
we will still be able to find some solution to the join problem, since at
least the join path directly implied by the original clauses will work.)

Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-11 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 I find it very unlikely that you would during normal operations end up
 in a situation where you would first have permissions to create files in
 a directory, and then lose them.
 What could be is that you have a directory where you never had
 permissions to create the file in the first place.

 Any chance to differentiate between these?

The cases we're concerned about involve access to an existing file, not
attempts to create a new one, so I'm not clear what your point is.

I would certainly *love* to differentiate between these failures and
ordinary permissions failures, but so far as I've heard we can't.

regards, tom lane



[HACKERS] Problem linking libecpg.5.3.dylib on OS X

2007-01-11 Thread Jim C. Nasby
I'm seeing the following on cuckoo:

gcc -pipe -no-cpp-precomp -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Winline -Wdeclaration-after-statement -Wendif-labels
-fno-strict-aliasing -g   -dynamiclib -install_name
/Users/buildfarm/buildfarm/HEAD/inst/lib/libecpg.5.dylib
-compatibility_version 5 -current_version 5.3  -multiply_defined
suppress  execute.o typename.o descriptor.o data.o error.o prepare.o
memory.o connect.o misc.o path.o  thread.o -L../pgtypeslib
-L../../../../src/interfaces/libpq -L../../../../src/port
-L/opt/local/lib -lpgtypes -lpq -lm  -o libecpg.5.3.dylib
ld: Undefined symbols:
_ecpg_internal_regression_mode
/usr/bin/libtool: internal link edit command failed
make[4]: *** [libecpg.5.3.dylib] Error 1

http://lnk.nu/pgbuildfarm.org/cxd.pl 

Any suggestions? Google and the archives aren't turning anything up :(
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-11 Thread Magnus Hagander
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 I find it very unlikely that you would during normal operations end up
 in a situation where you would first have permissions to create files in
 a directory, and then lose them.
 What could be is that you have a directory where you never had
 permissions to create the file in the first place.
 
 Any chance to differentiate between these?
 
 The cases we're concerned about involve access to an existing file, not
 attempts to create a new one, so I'm not clear what your point is.

Well, then I don't see it as being a big problem, which was the
question, I think. If pgsql had permissions to create the file, it would
never lose them unless the dba changed something - and if the dba changed
something, then he should check his logs afterwards to make sure he
didn't break anything.

My point is that if we know that *we* could create the file, then the
probability of it being an *actual* permissions problem is very low
during normal operations. So it's most likely the delete issue, and
thus doing what you propose does seem like a fairly safe bet.


 I would certainly *love* to differentiate between these failures and
 ordinary permissions failures, but so far as I've heard we can't.

Right, that's the base problem.

//Magnus



Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-11 Thread Jim C. Nasby
On Thu, Jan 11, 2007 at 04:32:42PM -0500, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  Given that this could result in data loss, if this was to be done I'd
  very much want to see a way to disable it in a production environment.
 
 Production environments are the same ones that won't be happy with
 random checkpoint failures, either.

Maybe I'm not understanding what happens in a checkpoint failure, but
I'd rather have my pg_xlog fill up (hopefully with a lot of WARNINGs
thrown beforehand) and face a long recovery than lose data...

 If we can't find a way to positively identify the deleted-file failures
 then I think we've got to do something like this.
 
 (You know, of course, that my opinion is that no sane person would run a
 production database on Windows in the first place.  So the data-loss
 risk to me seems less of a problem than the unexpected-failures problem.
 It's not like there aren't a ton of other data-loss scenarios in that OS
 that we can't do anything about...)

Yeah, and I share your opinion. Unfortunately, a lot of others do not.
:(

It would be useful if we had a page somewhere that explained in detail
what these data-loss issues were and why they're out of our control. At
least then people would (or could...) understand why production +
Windows == BadIdea.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-11 Thread Richard Troy

On Thu, 11 Jan 2007, Tom Lane wrote:

...snip...

 (You know, of course, that my opinion is that no sane person would run a
 production database on Windows in the first place.  So the data-loss
 risk to me seems less of a problem than the unexpected-failures problem.
 It's not like there aren't a ton of other data-loss scenarios in that OS
 that we can't do anything about...)

   regards, tom lane


PLEASE OH PLEASE document every f-ing one of them! (And I don't mean
document Windows issues as comments in the source code. Best would be in
the official documentation/on a web page.) On occasion, I could *really*
use such a list! (If such already exists, please point me at it!)

Thing is, Tom, not everybody has the same level of information you have on
the subject...

Regards,
Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




[HACKERS] copy table from file: with row replacement?

2007-01-11 Thread Michael Enke

Hello all,
I have a feature request, as I think this is not possible with the current version:

I want to load a huge amount of data and I know that COPY is much faster than
doing inserts.
But in my case I have an already filled table, and rows (not all, only some)
from this table should be replaced. The table has a primary key on one column.
If I do a COPY table FROM file and a key value already exists, postgresql
tells me that the import is not possible because of the violation of the PK.

If postgres is aware of such a violation, couldn't there be an option to the
COPY command to delete such existing rows, so that a COPY table FROM file
would never generate a PK violation message but replace existing rows
instead?

If this is not possible, would the next fastest solution be to create a
before trigger and delete the rows in this trigger? Or is this no different
from issuing an insert for every line and, if this fails (because of the PK),
then an update?

Thank you,
Michael

PS: Please CC to my email



Re: [HACKERS] Some notes about redesigning planner data structures

2007-01-11 Thread Martijn van Oosterhout
On Thu, Jan 11, 2007 at 04:03:55PM -0500, Tom Lane wrote:
 I've been looking at improving the planner so that it can handle things
 like backwards-order mergejoins, and I'm starting to realize that the
 old assumption that mergejoinable operators had only one associated sort
 ordering is wired into even more places than I thought.  In particular,
 the PathKeys data structure (see src/backend/optimizer/README if you
 don't know about it) seems to need revisions.  As it stands we'd have to
 generate a lot of redundant PathKeys.

snip much mind-blowing stuff

For the parts I understand, I agree. This is something that long-term
will allow the planner to make smarter decisions about relations
between different types. And if in the future we ever implement
COLLATE, I think we're creating the right level of abstraction here.

 A possible objection to this is that it ties the planner's handling of
 sort ordering even more tightly to btree, but actually I think that's
 not a big problem.  We could handle opfamilies belonging to multiple
 orderable index types as long as they all use btree's strategy numbers.
 In any case, with no new orderable index types on the horizon, I'm not
 all that worried about this.

No problem here; the btree structure is PostgreSQL's representation of
the concept of order, so it's logical it gets tied in everywhere.

 One of the things I don't like about generate_implied_equalities is that
 it has to fail if there's no cross-type equality operator for a
 particular datatype combination.  Currently we tell people they'd better
 make sure that mergejoinable operators come in complete cross-type sets,
 but that's not real attractive.  This approach can improve the
 situation: rather than failing if we can't generate *all* the equality
 combinations implied by a particular equivalence set, we need only fail
 if we can't generate *any* of the valid combinations for a particular
 join.  What's more, failure need no longer mean elog(ERROR), it just
 means we reject that particular join path as invalid.  (We can be sure
 we will still be able to find some solution to the join problem, since
 at least the join path directly implied by the original clauses will
 work.)

Sounds great...

PS. I'm glad you're doing this, because I wouldn't know where to
start... Keep up the good work!

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




[HACKERS] O_DIRECT, or madvise and/or posix_fadvise

2007-01-11 Thread markwkm

I caught this thread about O_DIRECT on kerneltrap.org:
 http://kerneltrap.org/node/7563

It sounds like there is much to be gained here in terms of reducing
the number of user/kernel space copies in the operating system.  I got
the impression that posix_fadvise in the Linux kernel isn't as good as
it could be.  I noticed in xlog.c that the use of posix_fadvise is
disabled.  Maybe it's time to do some more experimenting and working
with the Linux kernel developers.  Or perhaps there is another OS that
would be better to experiment with?
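
For anyone wanting to experiment, the call itself is simple (a sketch;
whether the kernel does anything useful with the hint is exactly the open
question):

#include <fcntl.h>

/*
 * Sketch: hint that an already-written and fsync'd file (say, a WAL
 * segment) won't be read again soon, so the kernel may drop its
 * page-cache copy.  The call is purely advisory; failure is ignorable.
 */
static void
hint_dontneed(int fd)
{
    (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}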

Not sure where to start, but do people think this is worth taking a stab at?

Regards,
Mark



Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-11 Thread Andrew Dunstan

Richard Troy wrote:

On Thu, 11 Jan 2007, Tom Lane wrote:

...snip...
  

(You know, of course, that my opinion is that no sane person would run a
production database on Windows in the first place.  So the data-loss
risk to me seems less of a problem than the unexpected-failures problem.
It's not like there aren't a ton of other data-loss scenarios in that OS
that we can't do anything about...)





PLEASE OH PLEASE document every f-ing one of them! (And I don't mean
document Windows issues as comments in the source code. Best would be in
the official documentation/on a web page.) On occasion, I could *really*
use such a list! (If such already exists, please point me at it!)

Thing is, Tom, not everybody has the same level of information you have on
the subject...


  



Please don't. At least not on the PostgreSQL web site or in the docs.
And no, I don't run my production servers on Windows either.


For good or ill, we made a decision years ago to do a proper Windows
port. I think that it's actually worked out reasonably well. All
operating systems have warts. Not long ago I tended to advise people not
to run mission-critical PostgreSQL on Linux unless they were *very*
careful, due to the overcommit issue.


In fact, I don't trust any OS. I use dumps and backups and replication 
to protect myself from them all.


In the present instance, the data loss risk is largely theoretical, as I
understand it, since we don't expect a genuine EACCES error.


cheers

andrew



Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-11 Thread Joshua D. Drake
 
 

 
 
 Please don't. At least not on the PostgreSQL web site nor in the docs. 
 And no, I don't run my production servers on Windows either.

It does seem like it might be a good idea to have FAQs based on each OS,
yes? There are various things that affect each OS differently, the most
obvious to me being shared memory and wal_sync_method.

It could be a good idea to have.

Joshua D. Drake

-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate






Re: [HACKERS] Recent ecpg patch...

2007-01-11 Thread Joachim Wieland
On Thu, Jan 11, 2007 at 09:59:14PM +0100, Magnus Hagander wrote:
 .. appears to have killed win32. It did kill my manual MSVC builds, but
 it also seems to have killed win32 buildfarm members yak and snake:
 http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=yakdt=2007-01-11%2020:32:11
 http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=snakedt=2007-01-11%2018:30:01

 (same error on mingw and msvc)

ecpg_internal_regression_mode has to be defined in at least one file without
the extern keyword.

With the attached patch I get a clean build again with MSVC.


Joachim
? .deps
? libecpg.so.5.3
Index: misc.c
===
RCS file: /projects/cvsroot/pgsql/src/interfaces/ecpg/ecpglib/misc.c,v
retrieving revision 1.33
diff -u -r1.33 misc.c
--- misc.c	11 Jan 2007 15:47:33 -	1.33
+++ misc.c	11 Jan 2007 23:25:29 -
@@ -28,7 +28,7 @@
 #endif
 #endif
 
-extern int ecpg_internal_regression_mode;
+int ecpg_internal_regression_mode;
 
 static struct sqlca_t sqlca_init =
 {



Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-11 Thread jam
On Thu, Jan 11, 2007 at 03:12:07PM -0800, Joshua D. Drake wrote:
 It does seem like it might be a good idea to have FAQs based on each OS,
 yes? There are various things that affect each OS differently, the most
 obvious to me being shared memory and wal_sync_method.
 
 If could be a good idea to have.
 
 Joshua D. Drake
 

+1

regards,
J



Re: [HACKERS] [COMMITTERS] pgsql: Stamp major release 8.3.0,

2007-01-11 Thread Joachim Wieland
On Thu, Jan 11, 2007 at 01:15:56PM +0100, Magnus Hagander wrote:
 Can't comment on that one, since I just noticed it existed. How similar
 is this one to the standard regression tests? Those were moved into a
 C executable so they'd run on a Windows system without a shell; could
 the same be done relatively easily with this one?

 (Obviously we can't run the ecpg regression tests on msvc builds now -
 oops, didn't know those had their own script)

The ecpg regression tests came in when you started to rewrite the old
regression script. Actually we exchanged some e-mails about this topic at
that time :-)

To get ecpg regression tests on msvc we could either convert the script to
a .c file as well or think about a general regression test library that
could be used by contrib or pgfoundry modules as well. Does anybody have
pointers to the archives on this topic?


Joachim






Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Simon Riggs
On Wed, 2007-01-10 at 23:32 -0500, Bruce Momjian wrote:
 Simon Riggs wrote:
  On Fri, 2007-01-05 at 22:57 -0500, Tom Lane wrote:
   Jim Nasby [EMAIL PROTECTED] writes:
On Jan 5, 2007, at 6:30 AM, Zeugswetter Andreas ADI SD wrote:
Ok, so when you need CRC's on a replicate (but not on the master) you
   
Which sounds to me like a good reason to allow the option in  
recovery.conf as well...
   
   Actually, I'm not seeing the use-case for a slave having a different
   setting from the master at all?
   
 My backup server is less reliable than the primary.
   
 My backup server is more reliable than the primary.
   
   Somehow, neither of these statements seem likely to be uttered by
   a sane DBA ...
  
  If I take a backup of a server and bring it up on a new system, the
  blocks in the backup will not have been CRC checked before they go to
  disk.
  
  If I take the same server and now stream log records across to it, why
  *must* that data be CRC checked, when the original data has not been?
  
  I'm proposing choice, with a safe default. That's all.
 
 I am assuming this item is dead because no performance results have been
 reported.

It's been on my hold queue, as a result of its lack of clear acceptance.

Results from earlier tests show that the routines dominated by CRC
checking overhead are prominent in a number of important workloads.
Those workloads all have a substantial disk component, so test results
will vary between no saving at all on a disk-bound system and some
savings on a CPU-bound system.

Restore    RecordIsValid() #1 on oprofile results at 50-70% CPU

COPY       XLogInsert() #1 on oprofile results at 17% CPU
           (full_page_writes = on)

OLTP       no test with full_page_writes = on (less relevant)

OLTP       XLogInsert() #5 on oprofile results at 1.2% CPU
           (full_page_writes = off)
Removing the CRC checks on WAL would likely be the easiest way to remove 1%
CPU on the system as it stands. Other changes require algorithmic or
architectural changes to improve matters, though the gains can be much
larger. 1% doesn't sound like much, but PostgreSQL is a very sleek beast
these days. As we improve things in other areas, the importance of this
patch as a tuning switch will grow.

Clearly the current patch is not accepted, but can we imagine a patch
that saved substantial CPU time in these areas that would be acceptable?
*Always* as a non-default option, IMHO, with careful documentation as to
its possible use.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 COPY  XLogInsert() #1 on oprofile results at 17% CPU
   (full_page_writes = on)

But what portion of that is actually CRC-related?  XLogInsert does quite
a lot.

Anyway, I can't see degrading the reliability of the system for a gain
in the range of a few percent, which is the most that we'd be likely
to get here ... for a factor of two or more, maybe people would be
willing to take a risk.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Simon Riggs
On Thu, 2007-01-11 at 09:01 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  COPY:    XLogInsert() #1 on oprofile results at 17% CPU
  (full_page_writes = on)
 
 But what portion of that is actually CRC-related?  XLogInsert does quite
 a lot.
 
 Anyway, I can't see degrading the reliability of the system for a gain
 in the range of a few percent, which is the most that we'd be likely
 to get here ... for a factor of two or more, maybe people would be
 willing to take a risk.

All I would add is that the loss of reliability was not certain in all
cases, otherwise I myself would have dropped the idea long ago. With the
spectre of doubt surrounding this, I'm happy to drop the idea until we
have proof/greater certainty either way.

Patch revoked.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 What did you think about protecting against torn writes using id numbers
 every 512 bytes?

Pretty much not happening; or are you volunteering to fix every part of
the system to tolerate injections of inserted data anywhere in a stored
datum?

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 Pretty much not happening; or are you volunteering to fix every part of
 the system to tolerate injections of inserted data anywhere in a stored
 datum?

 I was thinking to do it at a low level as the xlog records are prepared to be
 written to the filesystem and as the data is being read from disk. I haven't
 read that code yet to see where to inject it but I understand there's already
 a copy happening and it could be done there.

You understand wrong ... a tuple sitting on disk is normally read
directly from the shared buffer, and I don't think we want to pay for
copying it.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 Pretty much not happening; or are you volunteering to fix every part of
 the system to tolerate injections of inserted data anywhere in a stored
 datum?

 I was thinking to do it at a low level as the xlog records are prepared to be
 written to the filesystem and as the data is being read from disk. I haven't
 read that code yet to see where to inject it but I understand there's already
 a copy happening and it could be done there.

 You understand wrong ... a tuple sitting on disk is normally read
 directly from the shared buffer, and I don't think we want to pay for
 copying it.

xlog records


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 You understand wrong ... a tuple sitting on disk is normally read
 directly from the shared buffer, and I don't think we want to pay for
 copying it.

 xlog records

Oh, sorry, had the wrong context in mind.  I'm still not very impressed
with the idea --- a CRC check will catch many kinds of problems, whereas
this approach catches exactly one kind of problem.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 Oh, sorry, had the wrong context in mind.  I'm still not very impressed
 with the idea --- a CRC check will catch many kinds of problems, whereas
 this approach catches exactly one kind of problem.

Well in fairness I tossed in a throwaway comment at the end of that email
about heap pages. I'll do the same here since I can't resist. But the main
thread here is about xlog really.

It just seems to me like it's better to target each problem with a solution
that addresses it directly than have one feature that we hope addresses
them all more or less.

Having a CRC in WAL but not in the heap seems kind of pointless. If your
hardware is unreliable the corruption could be anywhere. Depending on it to solve
multiple problems means we can't offer the option to disable it because it
would affect other things as well.

What I would like to see is a CRC option that would put CRC checks in every
disk page whether heap, index, WAL, control file, etc. I think we would
default that to off to match our current setup most closely.

Separately we would have a feature in WAL to detect torn pages so that we can
reliably detect the end of valid WAL. That would have to always be on. But
having it as a separate feature means the CRC could be optional.

Also, incidentally like I mentioned in my previous email, we could do the torn
page detection in heap pages too by handling it in the smgr using
readv/writev. No copies, no corrupted datums. Essentially the tags would be
inserted on the fly as the data was copied into kernel space.
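
To make that concrete, a hypothetical sketch of such a gather write (the
tag layout is invented for illustration; this is not how PostgreSQL writes
pages today, and it assumes IOV_MAX >= 32):

#include <stdint.h>
#include <sys/uio.h>
#include <unistd.h>

#define PAGE_SIZE	8192
#define CHUNK_SIZE	512
#define NCHUNKS		(PAGE_SIZE / CHUNK_SIZE)

/* Stamp the same write id after every 512-byte chunk as the page goes
 * out; on read-back, a mismatched id in any chunk reveals a torn write. */
static ssize_t
write_page_with_tags(int fd, const char *page, uint32_t write_id)
{
	struct iovec iov[NCHUNKS * 2];
	uint32_t	tags[NCHUNKS];
	int			i;

	for (i = 0; i < NCHUNKS; i++)
	{
		tags[i] = write_id;
		iov[2 * i].iov_base = (void *) (page + i * CHUNK_SIZE);
		iov[2 * i].iov_len = CHUNK_SIZE;
		iov[2 * i + 1].iov_base = &tags[i];
		iov[2 * i + 1].iov_len = sizeof(uint32_t);
	}
	/* the kernel gathers data and tags in one call: no extra user-space copy */
	return writev(fd, iov, NCHUNKS * 2);
}

Note the stored page grows by 64 bytes and no longer fits a power-of-two
size, and reads need the mirror-image readv() plus an id check per chunk --
some of the practical wrinkles the idea would have to solve.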

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Patch to log usage of temporary files

2007-01-11 Thread Simon Riggs
On Tue, 2007-01-09 at 17:16 -0500, Bruce Momjian wrote:
 Tom Lane wrote:

	 /* reset flag so that die() interrupt won't cause
	    problems */
	 vfdP->fdstate &= ~FD_TEMPORARY;
	+ PG_TRACE1(temp__file__cleanup, vfdP->fileName);
	+ if (log_temp_files >= 0)
	+ {
	+	if (stat(vfdP->fileName, &filestats) == 0)
  
  The TRACE is in the wrong place no?  I thought it was going to be after
  the stat() operation so it could pass the file size.

We had that discussion already. If you only pass it after the stat()
then you cannot use DTrace, except when you already get a message in the
log and therefore don't need DTrace. DTrace can get the filesize if it
likes, but thats up to the script author.

  Also, I dunno much about DTrace, but I had the idea that you can't
  simply throw a PG_TRACE macro into the source and think you are done
  --- isn't there a file of probe declarations to add to?  Not to mention
  the documentation of what probes exist.
 
 I didn't like the macro in that area anyway.  It seems too ad hoc to
 just throw it in when we have so few places monitored now.  Removed.

err... why are we removing it? The patch should have included an
addition to the probes.d file also, but that should be fixed, not
removed. Don't we normally reject incomplete patches?

You can't say we don't have many probes so we won't add one. There never
will be many if we do that - it's a circular argument.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PATCHES] Patch to log usage of temporary files

2007-01-11 Thread Bruce Momjian
Simon Riggs wrote:
   Also, I dunno much about DTrace, but I had the idea that you can't
   simply throw a PG_TRACE macro into the source and think you are done
   --- isn't there a file of probe declarations to add to?  Not to mention
   the documentation of what probes exist.
  
  I didn't like the macro in that area anyway.  It seems too ad hoc to
  just throw it in when we have so few places monitored now.  Removed.
 
 err... why are we removing it? The patch should have included an
 addition to the probes.d file also, but that should be fixed, not
 removed. Don't we normally reject incomplete patches?
 
 You can't say we don't have many probes so we won't add one. There never
  will be many if we do that - it's a circular argument.

The trace probe was incorrect and kind of at an odd place.  I don't
think we want to go down the road of throwing trace in everywhere, do we?
I would like to see a more systematic approach to it.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PATCHES] Patch to log usage of temporary files

2007-01-11 Thread Simon Riggs
On Thu, 2007-01-11 at 12:35 -0500, Tom Lane wrote: 
 Simon Riggs [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  The TRACE is in the wrong place no?  I thought it was going to be after
  the stat() operation so it could pass the file size.
 
  We had that discussion already. If you only pass it after the stat()
  then you cannot use DTrace, except when you already get a message in the
  log and therefore don't need DTrace.
 
 We may have had the discussion, but apparently you didn't follow it :-(.

My apologies.

  You can't say we don't have many probes so we won't add one. There never
  will be many if we do that - it's a circular argument.
 
 I think the real criterion has to be is this probe useful to
 developers?.  I'm entirely uninterested in adding probes that are
 targeted towards DBAs, as this one would have been --- if we think
 there's a problem that a DBA would have, we need to offer a more
 portable solution than that.  Which we did, in the form of a logging
 option, which makes the DTrace probe pretty useless anyway.

Well, you know my major objection to including DTrace was all to do with
portability. I'm happy that the way its been implemented allows other
solutions to take advantage of the trace points also.

We're working on 8.3 now and by the time that is delivered and perhaps
for 2 years hence, i.e. Aug 2009, the software will be in production
use. In that period, DTrace will have been ported more widely and I'm
hearing that some kind of user space solution for Linux will mature in
that time also. If that isn't true then I'll be more interested in some
custom tracing solutions built around the PG_TRACE macro concept.

My thought is to provide both a log-based trace solution as has been
done, plus a hook for PG_TRACE (not just DTrace) at the same time. i.e.
each time we enhance the logging infrastructure, take the time to place
a trace point there also.

Theologically, we both know we see things differently on the DBA v
Developer discussion. The only point I would make is that the more
information you give the DBA, the more comes back to you as a Developer.
You, personally, could not possibly have interacted with as many server
set-ups as would be required to highlight the problems and issues you
address. It's only because of the info provided by the existing system
that you're
able to make headway with rare optimizer problems. My perspective is
that if you help the DBA you also help the Developer; if you help the
Developer only, then the Developer's information is also inevitably
restricted. The tip says "explain analyze is your friend". It's right,
and it isn't just talking to DBAs. My feeling is that this is true for
all tools/trace mechanisms.

I'd rather be sent the info than have to go do it myself on an
individual basis. Indirect access isn't the best way, but we harvest a
much wider range of information that way.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] [PATCHES] wal_checksum = on (default) | off

2007-01-11 Thread Simon Riggs
On Thu, 2007-01-11 at 17:06 +, Gregory Stark wrote:
 Having a CRC in WAL but not in the heap seems kind of pointless. 

Yes...

 If your
 hardware is unreliable the corruption could be anywhere.

Agreed.

Other DBMS have one setting for the whole server; I've never seen
separate settings for WAL and data.

 Depending on it to solve multiple problems means we can't offer the
 option to disable it because it would affect other things as well.
 
 What I would like to see is a CRC option that would put CRC checks in
 every disk page whether heap, index, WAL, control file, etc. I think we
 would default that to off to match our current setup most closely.
 
 Separately we would have a feature in WAL to detect torn pages so that
 we can reliably detect the end of valid WAL. That would have to always
 be on. But having it as a separate feature means the CRC could be
 optional.

Your thoughts seem logical to me.

It does seem a bigger project than I'd envisaged, but doable, one day.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PERFORM] table partioning performance

2007-01-11 Thread Simon Riggs
On Wed, 2007-01-10 at 16:00 -0500, Steven Flatt wrote:
 On 1/9/07, Simon Riggs [EMAIL PROTECTED] wrote:
 If you are doing date range partitioning it should be fairly simple to
 load data into the latest table directly. That was the way I originally
 intended for it to be used. The rules approach isn't something I'd
 recommend as a bulk loading option and it's a lot more complex anyway.
 The problem we have with blindly loading all data into the latest
 table is that some data (< 5%, possibly even much less) is actually
 delivered late and belongs in earlier partitions.  So we still
 needed the ability to send data to an arbitrary partition.

Yes, understand the problem.

COPY is always going to be faster than INSERTs anyhow and COPY doesn't
allow views, nor utilise rules. You can set up a client-side program to
pre-qualify the data and feed it to multiple simultaneous COPY commands,
as the best current way to handle this.
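
A minimal sketch of such a loader (libpq, two simultaneous COPYs; the
table names and the routing test are invented, and error handling and
result cleanup are elided):

#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn *cur = PQconnectdb("dbname=test");
	PGconn *late = PQconnectdb("dbname=test");
	char	line[8192];

	PQexec(cur, "COPY measurements_2007_01 FROM STDIN");
	PQexec(late, "COPY measurements_2006_12 FROM STDIN");

	while (fgets(line, sizeof(line), stdin) != NULL)
	{
		/* pre-qualify: current-month rows go to the live partition,
		 * anything older to the late-arrival one */
		PGconn *target = (strncmp(line, "2007-01", 7) == 0) ? cur : late;

		PQputCopyData(target, line, (int) strlen(line));
	}
	PQputCopyEnd(cur, NULL);
	PQputCopyEnd(late, NULL);
	PQfinish(cur);
	PQfinish(late);
	return 0;
}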

--
Next section aimed at pgsql-hackers, relates directly to above:


My longer term solution looks like this:

1. load all data into newly created partition (optimised in a newly
submitted patch for 8.3), then add the table as a new partition

2. use a newly created, permanent errortable into which rows that
don't match constraints or have other formatting problems would be put.
Following the COPY you would then run an INSERT SELECT to load the
remaining rows from the errortable into their appropriate tables. The
INSERT statement could target the parent table, so that rules to
distribute the rows would be applied appropriately. When all of those
have happened, drop the errortable. This would allow the database to
apply its constraints accurately without aborting the load when a
constraint error occurs.

In the use case you outline this would provide a fast path for 95% of
the data load, plus a straightforward mechanism for the remaining 5%.

We discussed this on hackers earlier, though we had difficulty with
handling unique constraint errors, so the idea was shelved. The
errortable part of the concept was sound however.
http://archives.postgresql.org/pgsql-hackers/2005-11/msg01100.php
James William Pye had a similar proposal
http://archives.postgresql.org/pgsql-hackers/2006-02/msg00120.php

The current TODO says:

	Allow COPY to report error lines and continue
	This requires the use of a savepoint before each COPY line is
	processed, with ROLLBACK on COPY failure.

If we agreed that the TODO actually has two parts to it, each of which
is separately implementable:
1. load errors to a table (all errors apart from uniqueness violation)
2. do something sensible with unique violation ERRORs

IMHO part (1) can be implemented without Savepoints, which testing has
shown (see James' results) would not be an acceptable solution for bulk
data loading. So (1) can be implemented fairly easily, whereas (2)
remains an issue that we have no acceptable solution for, as yet.

Can we agree to splitting the TODO into two parts? That way we stand a
chance of getting at least some functionality in this important area.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2 upgrade

2007-01-11 Thread Tom Lane
Kim [EMAIL PROTECTED] writes:
 We were running on 8.1.1 previous to upgrading to 8.2, and yes, we 
 definitely have a hefty pg_class. The inheritance model is heavily used 
 in our schema (the results of the group by you wanted to see are down 
 below).  However, no significant problems were seen with vacs while we 
 were on 8.1.

Odd, because the 8.1 code looks about the same, and it is perfectly
obvious in hindsight that its runtime is about O(N^2) in the number of
relations :-(.  At least that'd be the case if the stats collector
output were fully populated.  Did you have either stats_block_level or
stats_row_level turned on in 8.1?  If not, maybe the reason for the
change is that in 8.2, that table *will* be pretty fully populated,
because now it's got a last-vacuum-time entry that gets made even if the
stats are otherwise turned off.  Perhaps making that non-disablable
wasn't such a hot idea :-(.

What I think we need to do about this is

(1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking
of using a hash table for the OIDs instead of a linear list.  Should be
a pretty small change; I'll work on it today.

(2) Reconsider whether last-vacuum-time should be sent to the collector
unconditionally.

Comments from hackers?
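
To make the shape of (1) concrete, a toy open-addressing version of the
idea -- hash all live relation OIDs once, then probe per stats entry in
O(1).  (The real fix would presumably use the backend's dynahash routines
rather than anything hand-rolled like this.)

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t Oid;

/* 0 marks an empty slot, assuming InvalidOid never needs to be stored.
 * nslots must be a power of two, comfortably above the number of OIDs. */
typedef struct
{
	Oid		   *slots;
	size_t		nslots;
} OidSet;

static OidSet *
oidset_create(size_t nslots)
{
	OidSet	   *set = malloc(sizeof(OidSet));

	set->slots = calloc(nslots, sizeof(Oid));
	set->nslots = nslots;
	return set;
}

static void
oidset_insert(OidSet *set, Oid oid)
{
	size_t		i = (oid * 2654435761u) & (set->nslots - 1);

	while (set->slots[i] != 0 && set->slots[i] != oid)
		i = (i + 1) & (set->nslots - 1);
	set->slots[i] = oid;
}

static bool
oidset_member(const OidSet *set, Oid oid)
{
	size_t		i = (oid * 2654435761u) & (set->nslots - 1);

	while (set->slots[i] != 0)
	{
		if (set->slots[i] == oid)
			return true;
		i = (i + 1) & (set->nslots - 1);
	}
	return false;
}

One pass to fill the set from pg_class plus one probe per stats entry makes
the cleanup O(N) overall instead of O(N^2).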

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2 upgrade

2007-01-11 Thread Alvaro Herrera
Tom Lane wrote:

 What I think we need to do about this is
 
 (1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking
 of using a hash table for the OIDs instead of a linear list.  Should be
 a pretty small change; I'll work on it today.
 
 (2) Reconsider whether last-vacuum-time should be sent to the collector
 unconditionally.

(2) seems a perfectly reasonably answer, but ISTM (1) would be good to
have anyway (at least in HEAD).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2 upgrade

2007-01-11 Thread Simon Riggs
On Thu, 2007-01-11 at 14:45 -0500, Tom Lane wrote:
 Kim [EMAIL PROTECTED] writes:
  We were running on 8.1.1 previous to upgrading to 8.2, and yes, we 
  definitely have a hefty pg_class. The inheritance model is heavily used 
  in our schema (the results of the group by you wanted to see are down 
  below).  However, no significant problems were seen with vacs while we 
  were on 8.1.
 
 Odd, because the 8.1 code looks about the same, and it is perfectly
 obvious in hindsight that its runtime is about O(N^2) in the number of
 relations :-(.  At least that'd be the case if the stats collector
 output were fully populated.  Did you have either stats_block_level or
 stats_row_level turned on in 8.1?  If not, maybe the reason for the
 change is that in 8.2, that table *will* be pretty fully populated,
 because now it's got a last-vacuum-time entry that gets made even if the
 stats are otherwise turned off.  Perhaps making that non-disablable
 wasn't such a hot idea :-(.
 
 What I think we need to do about this is
 
 (1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking
 of using a hash table for the OIDs instead of a linear list.  Should be
 a pretty small change; I'll work on it today.
 
 (2) Reconsider whether last-vacuum-time should be sent to the collector
 unconditionally.
 
 Comments from hackers?

It's not clear to me how this fix will alter the INSERT issue Kim
mentions. Are those issues connected? Or are you thinking that handling
stats in a tight loop is slowing down other aspects of the system?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] table partioning performance

2007-01-11 Thread Jim C. Nasby
On Thu, Jan 11, 2007 at 12:15:50PM +, Simon Riggs wrote:
 On Wed, 2007-01-10 at 16:00 -0500, Steven Flatt wrote:
  On 1/9/07, Simon Riggs [EMAIL PROTECTED] wrote:
  If you are doing date range partitioning it should be fairly simple to
  load data into the latest table directly. That was the way I originally
  intended for it to be used. The rules approach isn't something I'd
  recommend as a bulk loading option and it's a lot more complex anyway.
  The problem we have with blindly loading all data into the latest
  table is that some data (< 5%, possibly even much less) is actually
  delivered late and belongs in earlier partitions.  So we still
  needed the ability to send data to an arbitrary partition.
 
 Yes, understand the problem.
 
 COPY is always going to be faster than INSERTs anyhow and COPY doesn't
 allow views, nor utilise rules. You can set up a client-side program to
 pre-qualify the data and feed it to multiple simultaneous COPY commands,
 as the best current way to handle this.
 
 --
 Next section aimed at pgsql-hackers, relates directly to above:

I'm wondering if you see any issues with COPYing into a partitioned
table that's using triggers instead of rules to direct data to the
appropriate tables?

BTW, I think improved copy error handling would be great, and might
perform better than triggers, once we have it...
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2 upgrade

2007-01-11 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 It's not clear to me how this fix will alter the INSERT issue Kim
 mentions.

I didn't say that it would; we have no information on the INSERT issue,
so I'm just concentrating on the problem that he did provide info on.

(BTW, I suppose the slow-\d issue is the regex planning problem we
already knew about.)

I'm frankly not real surprised that there are performance issues with
such a huge pg_class; it's not a regime that anyone's spent any time
optimizing.  It is interesting that 8.2 seems to have regressed but
I can think of several places that would've been bad before.  One is
that there are seqscans of pg_inherits ...

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2 upgrade

2007-01-11 Thread Jim C. Nasby
On Thu, Jan 11, 2007 at 04:49:28PM -0300, Alvaro Herrera wrote:
 Tom Lane wrote:
 
  What I think we need to do about this is
  
  (1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking
  of using a hash table for the OIDs instead of a linear list.  Should be
  a pretty small change; I'll work on it today.
  
  (2) Reconsider whether last-vacuum-time should be sent to the collector
  unconditionally.
 
 (2) seems a perfectly reasonably answer, but ISTM (1) would be good to
 have anyway (at least in HEAD).

Actually, I'd rather see the impact #1 has before adding #2... If #1
means we're good for even someone with 10M relations, I don't see much
point in #2.

BTW, we're now starting to see more users with a large number of
relations, thanks to partitioning. It would probably be wise to expand
test coverage for that case, especially when it comes to performance.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2 upgrade

2007-01-11 Thread Tom Lane
I wrote:
 (2) Reconsider whether last-vacuum-time should be sent to the collector
 unconditionally.

Actually, now that I look, the collector already contains this logic:

/*
 * Don't create either the database or table entry if it doesn't already
 * exist.  This avoids bloating the stats with entries for stuff that is
 * only touched by vacuum and not by live operations.
 */

and ditto for analyze messages.  So my idea that the addition of
last-vac-time was causing an increase in the statistics file size
compared to 8.1 seems wrong.

How large is your $PGDATA/global/pgstat.stat file, anyway?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2 upgrade

2007-01-11 Thread Kim
Our pgstat.stat file is 40M for 8.2; on 8.1 it was 33M. Our schema size
hasn't grown *that* much in the two weeks since we upgraded.


I'm not sure if this sheds any more light on the situation, but in 
scanning down through the process output from truss, it looks like the 
first section of output was a large chunk of reads on pgstat.stat, 
followed by a larger chunk of reads on the global directory and 
directories under base - this whole section probably went on for a good 
6-7 minutes, though I would say the reads on pgstat likely finished 
within a couple of minutes or so. Following this there was a phase where 
it did a lot of seeks and reads on files under pg_clog, and it was while 
doing this (or perhaps after it had finished whatever it wanted with the 
clogs) that it dropped into the send()/SIGUSR1 loop that went on for 
another several minutes.


Kim


Tom Lane wrote:

 I wrote:
  (2) Reconsider whether last-vacuum-time should be sent to the collector
  unconditionally.

 Actually, now that I look, the collector already contains this logic:

     /*
      * Don't create either the database or table entry if it doesn't already
      * exist.  This avoids bloating the stats with entries for stuff that is
      * only touched by vacuum and not by live operations.
      */

 and ditto for analyze messages.  So my idea that the addition of
 last-vac-time was causing an increase in the statistics file size
 compared to 8.1 seems wrong.

 How large is your $PGDATA/global/pgstat.stat file, anyway?



Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2upgrade

2007-01-11 Thread Simon Riggs
On Thu, 2007-01-11 at 16:11 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  It's not clear to me how this fix will alter the INSERT issue Kim
  mentions.
 
 I didn't say that it would; we have no information on the INSERT issue,
 so I'm just concentrating on the problem that he did provide info on.

OK.

 I'm frankly not real surprised that there are performance issues with
 such a huge pg_class; it's not a regime that anyone's spent any time
 optimizing. 

Yeh, I saw a pg_class that big once, but it just needed a VACUUM.

Temp relations still make pg_class entries, don't they? Is that on the
TODO list to change?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-11 Thread Alvaro Herrera
Joshua D. Drake wrote:

  Please don't. At least not on the PostgreSQL web site nor in the docs. 
  And no, I don't run my production servers on Windows either.
 
 It does seem like it might be a good idea to have FAQs based on each OS,
 yes? There are various things that effect each OS differently. The most
 obvious to me being shared memory and wal_sync_method.

But we have per-platform FAQs.  If there is information missing, the
reason is that nobody has submitted an appropriate patch, nothing more.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-11 Thread jam
On Thu, Jan 11, 2007 at 09:42:38PM -0300, Alvaro Herrera wrote:
 
 But we have per-platform FAQs.  If there is information missing, the
 reason is that nobody has submitted an appropriate patch, nothing more.
 

Where are these FAQs, and why were they not easily found when the original
poster sent his email? Is there some SEO we need to do on the websites to
make things more obvious?

regards,
J



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-11 Thread Joshua D. Drake
On Thu, 2007-01-11 at 21:42 -0300, Alvaro Herrera wrote:
 Joshua D. Drake wrote:
 
   Please don't. At least not on the PostgreSQL web site nor in the docs. 
   And no, I don't run my production servers on Windows either.
  
  It does seem like it might be a good idea to have FAQs based on each OS,
  yes? There are various things that effect each OS differently. The most
  obvious to me being shared memory and wal_sync_method.
 
 But we have per-platform FAQs.  If there is information missing, the
 reason is that nobody has submitted an appropriate patch, nothing more.

Yes, you are correct, now that I look. They are not readily apparent
though, and they do appear to be quite out of date.

Joshua D. Drake




 
-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] NaN behavior

2007-01-11 Thread Neil Conway
postgres=# select 'NaN'::numeric = 'NaN'::numeric,
  'NaN'::float8 = 'NaN'::float8;
 ?column? | ?column? 
--+--
 t| t
(1 row)

This behavior is inconsistent with most people's notion of NaN -- in
particular, it is inconsistent with IEEE754. I can understand why
Postgres behaves this way, and we probably can't easily change it (if we
want to continue indexing NaN values, that is), but I think it should at
least be discussed in the documentation.
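
For contrast, the IEEE 754 behavior in C:

#include <math.h>
#include <stdio.h>

/* Under IEEE 754 a NaN compares unequal to everything, itself included --
 * exactly what the SQL-level result above diverges from. */
int
main(void)
{
	double		x = NAN;

	printf("x == x: %d\n", x == x);		/* prints 0 */
	printf("x != x: %d\n", x != x);		/* prints 1 */
	return 0;
}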

Comments? I'll write up a doc patch, barring any objections.

-Neil



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] PANIC: block 463 unfound during REDO after out of

2007-01-11 Thread Christopher Kings-Lynne

Btw - "unfound"?? I think the English there might need to be improved :)

Chris

On 1/11/07, Richard Huxton dev@archonet.com wrote:

Warren Guy wrote:
 Hi everyone

 Was running a VACUUM on a database on a partition which was running out
 of disk space. During VACUUM the server process died and failed to restart.

 Running PostgreSQL 8.1.4

...
 Jan 11 15:02:39 marshall postgres[73909]: [5-1] FATAL:  the database
 system is starting up
 Jan 11 15:02:40 marshall postgres[73888]: [12-1] PANIC:  block 463 unfound
 Jan 11 15:02:41 marshall postgres[67977]: [5-1] LOG:  startup process
 (PID 73888) was terminated by signal 6
 Jan 11 15:02:41 marshall postgres[67977]: [6-1] LOG:  aborting startup
 due to startup process failure

You say "was running out of disk space" - does that mean it did run out
of disk space? I don't see the error that caused this, just the results.
That would suggest to me that something unusual caused this (or you
clipped the log fragment too far down :-)

In any case, the first thing I'd try is to make your on-disk backups and
set it up as though it's PITR recovery you're doing. That way you can
stop the recovery before block 463 causes the failure. Oh, assuming
you've got the space you need on your partition of course.
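
Something along these lines in recovery.conf, with a target time taken from
your own logs safely before the crash (the archive path and timestamp here
are placeholders):

restore_command = 'cp /path/to/archive/%f %p'
recovery_target_time = '2007-01-11 14:55:00'
recovery_target_inclusive = 'false'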

HTH
--
   Richard Huxton
   Archonet Ltd

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster




--
Chris Kings-Lynne
Director
KKL Pty. Ltd.

Biz: +61 8 9328 4780
Mob: +61 (0)409 294078
Web: www.kkl.com.au

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] share info between backends

2007-01-11 Thread Jaime Casanova

On 1/11/07, Andrew Dunstan [EMAIL PROTECTED] wrote:

Jaime Casanova wrote:

 i'm trying to fix a problem related to the patch Albert sent in
 october (Tablespace for temporary objects and sort files)
 http://archives.postgresql.org/pgsql-patches/2006-10/msg00141.php
 http://archives.postgresql.org/pgsql-patches/2007-01/msg00063.php

 after reviewing this i notice this will not use different tablespaces
 in the case of having various connections all with just one temp
 table:
 http://archives.postgresql.org/pgsql-patches/2007-01/msg00188.php

  what i'm trying to do is share what's the next_temp_tablespace to
  use...
 the other way to do this is as you say using tables, but i don't think
 this is a good idea in this case...

 comments?


Why not make it probabilistic by using, say, MyProcPid % n where n is the
number of tablespaces? Then you don't need anything shared.



mmm... isn't that a bit like trying to kill flies with tanks? ;)
thanks for the idea, i will try it
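
the whole of the suggestion boils down to something like this (inside the
backend it would be the MyProcPid global rather than getpid()):

#include <unistd.h>

/* derive the temp tablespace index from the process id, so concurrent
 * backends spread across the n tablespaces with no shared state at all */
static int
choose_temp_tablespace(int num_temp_tablespaces)
{
	return (int) (getpid() % num_temp_tablespaces);
}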

--
regards,
Jaime Casanova

Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning.
  Richard Cook

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Problem linking libecpg.5.3.dylib on OS X

2007-01-11 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 I'm seeing the following on cuckoo:
 -L/opt/local/lib -lpgtypes -lpq -lm  -o libecpg.5.3.dylib
 ld: Undefined symbols:
 _ecpg_internal_regression_mode
 /usr/bin/libtool: internal link edit command failed

It looks like Joachim's last patch thinks it's OK for libecpg to try to
reference a variable declared by the calling program.  This will
surely Not Work everywhere, and even if it did work it'd be a bad idea
because it would guarantee that existing calling programs would break at
the next libecpg update.  The reference has to go the other way.

Actually I'd suggest that using an exported variable at all is probably
bad style.  I'd suggest that libecpg export a set() function instead:

static int ecpg_regression_mode = 0;

void
ecpg_set_regression_mode(int mode)
{
	ecpg_regression_mode = mode;
}

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-11 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 On Thu, Jan 11, 2007 at 03:14:37PM -0500, Tom Lane wrote:
 ... And anyway there should never
 *be* a real permissions problem; if there is then the user's been poking
 under the hood sufficient to void the warranty anyway ;-)

 Or some other helpful process such as a virus scanner has been poking
 under the hood for you... :(

One point worth making is that I'm not really convinced anymore that
we have proof that antivirus code has been creating any such problems.
We have several anecdotal cases where someone reported erratic
"permission denied" problems on Windows, and we suggested getting rid
of any AV code, and it seemed to fix their problem --- but how long did
they test?  This problem is inherently very timing-sensitive, and so the
fact that you don't see it for a little while is hardly proof that it's
gone.  See the report that started this thread for examples of apparent
correlations that are really quite spurious, like whether the test case
is being driven locally or not.  It could easily be that every report
we've heard really traces to the not-yet-deleted-file problem.

So basically what we'd have is that if you manually remove permissions
on a database file or directory you'd be risking data loss; but heck,
if you manually move, rename, delete such a file you're risking
(guaranteeing) data loss.  Any sane user is going to figure "keep your
fingers away from the moving parts"; or if he can't figure that out,
he's got no one but himself to blame.

It's not ideal, granted, but we're dealing with a much-less-than-ideal
OS, so we gotta make some compromises.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2upgrade

2007-01-11 Thread Alvaro Herrera
Simon Riggs wrote:

 Temp relations still make pg_class entries, don't they? Is that on the
 TODO list to change?

Yeah, and pg_attribute entries as well, which may be more problematic
because there are a lot of them.  Did we get rid of pg_attribute entries for
system attributes already?

Can we actually get rid of pg_class entries for temp tables?  Maybe
creating a temp pg_class which would be local to each session?  Heck,
it doesn't even have to be an actual table -- it just needs to be
somewhere from where we can load entries into the relcache.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2upgrade

2007-01-11 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Can we actually get rid of pg_class entries for temp tables.  Maybe
 creating a temp pg_class which would be local to each session?  Heck,
 it doesn't even have to be an actual table -- it just needs to be
 somewhere from where we can load entries into the relcache.

A few things to think about:

1. You'll break a whole lotta client-side code if temp tables disappear
from pg_class.  This is probably solvable --- one thought is to give
pg_class an inheritance child that is a view on a SRF that reads out the
stored-in-memory rows for temp pg_class entries.  Likewise for
pg_attribute and everything else related to a table definition.

2. How do you keep the OIDs for temp tables (and their associated
rowtypes) from conflicting with OIDs for real tables?  Given the way
that OID generation works, there wouldn't be any real problem unless a
temp table survived for as long as it takes the OID counter to wrap all
the way around --- but in a database that has WITH OIDS user tables,
that might not be impossibly long ...

3. What about dependencies on user-defined types, functions, etc?
How will you get things to behave sanely if one backend tries to drop a
type that some other backend is using in a column of a temp table?  Even
if you put entries into pg_depend, which would kind of defeat the point
of not having on-disk catalog entries for temp tables, I don't see how
the other backend figures out what the referencing object is.

I don't really see any solution to that last point :-(

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly