Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Simon Riggs
On Mon, 2006-10-23 at 15:12 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  On Mon, 2006-10-23 at 13:52 -0400, Tom Lane wrote:
  No can do --- we rely on the checksums to be able to tell when we've hit
  the end of WAL during replay.  
 
  No we don't: Zero length records are the trigger for EOF.
 
 Only if the file happens to be all-zero already, which is not the normal
 operating state (see WAL-file recycling).  Otherwise you have to be able
 to detect an invalid record.

OK.

 There are actually three checks used to detect end of WAL: zero record
 length, invalid checksum, and incorrect back-pointer.  Zero length is
 the first and cleanest-looking test, but AFAICS we have to have both of
 the others to avoid obvious failure modes.

Well, I understand the need for the zero-length test and the incorrect
back pointer (which also incidentally tests that the current record was
not left over from previous use of xlog file).

The checksum protects from torn pages and disk errors. If you have
full_page_writes set then you already believe yourself safe from torn
pages and your disks could also already be CRC-checking the data. So you
don't *need* the checksum in those cases. If we really think we need it
we could put the xlprev pointer as the *last* field on the xlrec, just
to make doubly sure - having a trailer as well as a header.
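The three end-of-WAL tests can be sketched roughly as follows. This is a simplified illustration, not the actual xlog.c code; the struct and field names are hypothetical stand-ins for the real XLogRecord layout:

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical, simplified record header; the real XLogRecord in
 * src/include/access/xlog.h carries more fields. */
typedef struct
{
    uint32_t xl_tot_len;    /* total record length; 0 signals end of WAL */
    uint64_t xl_prev;       /* position of the previous record */
    uint32_t xl_crc;        /* CRC of the record contents */
} WalRecordHeader;

/*
 * Apply the three end-of-WAL tests.  A failure of any one of them is
 * treated during replay as "we have run off the end of valid WAL".
 */
static bool
wal_record_is_valid(const WalRecordHeader *rec,
                    uint64_t expected_prev,  /* where the last record was */
                    uint32_t computed_crc)   /* CRC of the bytes on disk */
{
    if (rec->xl_tot_len == 0)
        return false;   /* zero length: clean end-of-WAL marker */
    if (rec->xl_prev != expected_prev)
        return false;   /* back-pointer mismatch: leftover from recycled file */
    if (rec->xl_crc != computed_crc)
        return false;   /* checksum mismatch: torn write or corruption */
    return true;
}
```

Note how the back-pointer test is what catches stale records left over from a recycled segment even when the length field happens to look plausible.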

CRC-checked disks are actually the industry norm and have been for
around 5 years. ANSI SCSI Parallel Interface 3 (SPI-3) (UltraSCSI 160)
defines the use of CRC and this is available as standard from all key
manufacturers. (CRC-32 is the required level). So if you are using
modern SCSI, SATA or SAS technologies you'll be just fine. (Those checks
are a *requirement* of the underlying technology because of the error
rates when the bus speeds are so high.)

I checked this out in a conversation with the fine people at Seagate,
who also publish a variety of technical manuals/details on this, e.g.
http://www.maxtor.com/_files/maxtor/en_us/documentation/white_papers_technical/wp_ultra320.pdf
...You'll see that CRC is a Mandatory Feature of ANSI SPI-4 now.
http://www.maxtor.com/_files/maxtor/en_us/documentation/white_papers_technical/sas_link_layer.pdf

So, I'd like the *option* to turn our CRC off, please, somehow.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum

2006-10-24 Thread Hitoshi Harada
 If the decision to vacuum based on autovacuum criteria is good enough
 for you then I think you should just focus on getting autovac to do what
 you want it to do. Perhaps you just need to decrease the sleep time to a
 few seconds, so that autovac will quickly detect when something needs to
 be vacuumed.

Thanks, I'll do it.
My database is updated frequently all the day and 
runs big building process a day.
Almost all the day autovac is ok but
in the big building process autovac annoys it, 
so I wished there might be the way to order autovac to do its process.


Hitoshi Harada


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Jim C. Nasby
 Sent: Tuesday, October 24, 2006 3:36 AM
 To: Hitoshi Harada
 Cc: 'Tom Lane'; 'Peter Eisentraut'; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum
 
 If the decision to vacuum based on autovacuum criteria is good enough
 for you then I think you should just focus on getting autovac to do what
 you want it to do. Perhaps you just need to decrease the sleep time to a
 few seconds, so that autovac will quickly detect when something needs to
 be vacuumed.
 
 The only case I can think of where autovac might not work as well as
 smartvacuum would be if you had a lot of databases in the cluster, since
 autovacuum will only vacuum one database at a time.
 
 On Mon, Oct 23, 2006 at 11:18:39AM +0900, Hitoshi Harada wrote:
  Ok,
 
  But my point is, autovacuum may conflict with a VACUUM ANALYZE command
  run in another session. My intention with smartvacuum() is based on this.
  Any solution for this?
 
  Regards,
 
 
  Hitoshi Harada
 
   -Original Message-
   From: [EMAIL PROTECTED]
   [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane
   Sent: Monday, October 23, 2006 11:10 AM
   To: Hitoshi Harada
   Cc: 'Peter Eisentraut'; pgsql-hackers@postgresql.org
   Subject: Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum
  
   Hitoshi Harada [EMAIL PROTECTED] writes:
How is this different from what autovacuum does?
  
My application needs to do vacuum by itself, while
autovacuum does it as daemon.
The database is updated so frequently that
normal vacuum costs too much and tables to be updated are
not so many as the whole database is vacuumed.
I want to use autovacuum except the feature of daemon,
but want to control when to vacuum and which table to vacuum.
So, nothing is different between autovacuum and smartvacuum(),
but former is daemon and later is user function.
  
   This seems completely unconvincing.  What are you going to do that
   couldn't be done by autovacuum?
  
 regards, tom lane
  
   ---(end of
broadcast)---
   TIP 5: don't forget to increase your free space map settings
 
 
  ---(end of broadcast)---
  TIP 5: don't forget to increase your free space map settings
 
 
 --
 Jim Nasby[EMAIL PROTECTED]
 EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
 
 ---(end of broadcast)---
 TIP 3: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faq


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] xlogdump fixups and WAL log question.

2006-10-24 Thread Markus Schaber
Hi, Simon,

Simon Riggs wrote:

 1. Provide a filter that can be easily used by archive_command to remove
 full page writes from WAL files. This would require us to disable the
 file size test when we begin recovery on a new WAL file, plus we would
 need to redesign the initial location of the checkpoint record, since we
 could no longer rely on the XLogRecPtr being a byte offset within the
 file.

pg_WAL_filter could correct the XLogRecPtr and file sizes during the
filter run.

HTH,
Markus

-- 
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] COPY does not work with regproc and aclitem

2006-10-24 Thread Zdenek Kotala

Tom Lane wrote:

Alvaro Herrera [EMAIL PROTECTED] writes:

Hmm, maybe it should be using regprocedure instead?


Not unless you want to break initdb.  The only reason regproc still
exists, really, is to accommodate loading of pg_type during initdb.
Guess what: we can't do type lookup at that point.



I prepared a patch which uses the oid output function instead of the
regproc output function. The change affects only the COPY TO command;
SELECT behavior is untouched. I extended the copy regression test as
well.

Please have a look at whether it is an acceptable fix.

With regards, Zdenek
Index: src/backend/commands/copy.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/commands/copy.c,v
retrieving revision 1.271
diff -c -r1.271 copy.c
*** src/backend/commands/copy.c	31 Aug 2006 03:17:50 -	1.271
--- src/backend/commands/copy.c	24 Oct 2006 12:35:45 -
***
*** 1309,1315 
  	out_func_oid,
  	isvarlena);
  		else
! 			getTypeOutputInfo(attr[attnum - 1]->atttypid,
  			  out_func_oid,
  			  isvarlena);
  		fmgr_info(out_func_oid, cstate->out_functions[attnum - 1]);
--- 1309,1317 
  	out_func_oid,
  	isvarlena);
  		else
! 			/* For the regproc datatype, do not look up the proc name; use the
! 			   OID output function instead.  This avoids problems with COPY FROM. */
! 			getTypeOutputInfo(attr[attnum - 1]->atttypid == REGPROCOID ? OIDOID : attr[attnum - 1]->atttypid,
  			  out_func_oid,
  			  isvarlena);
  		fmgr_info(out_func_oid, cstate->out_functions[attnum - 1]);
Index: src/test/regress/input/copy.source
===
RCS file: /projects/cvsroot/pgsql/src/test/regress/input/copy.source,v
retrieving revision 1.14
diff -c -r1.14 copy.source
*** src/test/regress/input/copy.source	2 May 2006 11:28:56 -	1.14
--- src/test/regress/input/copy.source	24 Oct 2006 12:35:46 -
***
*** 105,107 
--- 105,113 
  
  copy copytest3 to stdout csv header;
  
+ --- test correct handling regproc data type
+ CREATE TEMP TABLE test_regproc (like pg_aggregate);
+ COPY pg_catalog.pg_aggregate TO '@abs_builddir@/results/test_regproc.data';
+ COPY test_regproc FROM '@abs_builddir@/results/test_regproc.data';
+ 
+ select aggfnoid, cast(aggfnoid as oid) from pg_aggregate where aggfnoid=2147;
Index: src/test/regress/output/copy.source
===
RCS file: /projects/cvsroot/pgsql/src/test/regress/output/copy.source,v
retrieving revision 1.12
diff -c -r1.12 copy.source
*** src/test/regress/output/copy.source	2 May 2006 11:28:56 -	1.12
--- src/test/regress/output/copy.source	24 Oct 2006 12:35:46 -
***
*** 70,72 
--- 70,82 
  c1,col with , comma,col with  quote
  1,a,1
  2,b,2
+ --- test correct handling regproc data type
+ CREATE TEMP TABLE test_regproc (like pg_aggregate);
+ COPY pg_catalog.pg_aggregate TO '@abs_builddir@/results/test_regproc.data';
+ COPY test_regproc FROM '@abs_builddir@/results/test_regproc.data';
+ select aggfnoid, cast(aggfnoid as oid) from pg_aggregate where aggfnoid=2147;
+  aggfnoid | aggfnoid
+ --+--
+  pg_catalog.count | 2147
+ (1 row)
+ 

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 On Mon, 2006-10-23 at 15:12 -0400, Tom Lane wrote:
 There are actually three checks used to detect end of WAL: zero record
 length, invalid checksum, and incorrect back-pointer.  Zero length is
 the first and cleanest-looking test, but AFAICS we have to have both of
 the others to avoid obvious failure modes.

 The checksum protects from torn pages and disk errors. If you have
 full_page_writes set then you already believe yourself safe from torn
 pages and your disks could also already be CRC-checking the data.

No, because unlike tuples, WAL records can and do cross page boundaries.
Unless you're prepared to posit that your system never crashes at all,
you have to be able to detect the case where you've got a good front
half of a WAL record and non-matching data in the next page.  The two
halves could be written out in widely separated write()s, indeed might
never have been simultaneously resident in the WAL buffers at all.
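The reason the checksum catches this case can be seen with any incremental CRC: the value is accumulated over both halves of the record, so if the second page never made it to disk, the final CRC cannot match. A rough sketch, using a plain bitwise CRC-32 purely for illustration (PostgreSQL's actual WAL CRC, and the slicing-by-8 variant this thread discusses, are faster table-driven forms of the same computation):

```c
#include <stdint.h>
#include <stddef.h>

/* Plain bitwise CRC-32 (reflected, polynomial 0xEDB88320) -- slow but
 * short; table-driven and slicing-by-8 versions compute the same value. */
static uint32_t
crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
    {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320U & (0U - (crc & 1U)));
    }
    return crc;
}

/* A record split across two pages: CRC the first half, then continue
 * over whatever the second page actually contains.  If the second
 * write() never completed, the final value will not match the CRC
 * stored in the record header. */
static uint32_t
crc32_record(const unsigned char *part1, size_t len1,
             const unsigned char *part2, size_t len2)
{
    uint32_t crc = 0xFFFFFFFFU;
    crc = crc32_update(crc, part1, len1);
    crc = crc32_update(crc, part2, len2);
    return crc ^ 0xFFFFFFFFU;
}
```

The split point does not affect the result, which is what lets the same check cover records written out in widely separated write()s.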

 CRC-checked disks are actually the industry norm and have been for
 around 5 years.

Huh?  Disks have ALWAYS had CRCs, and this is in any case utterly
irrelevant to the database-crash risk.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] COPY does not work with regproc and aclitem

2006-10-24 Thread Tom Lane
Zdenek Kotala [EMAIL PROTECTED] writes:
 I prepared patch which use oid output function instead regproc output. 
 This change works only for COPY TO command.

This is not a bug and we're not going to fix it, most especially not
like that.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] Release stamping (Was: [CORE] Schedule for release?)

2006-10-24 Thread Dave Page
 

 -Original Message-
 From: Tom Lane [mailto:[EMAIL PROTECTED] 
 Sent: 24 October 2006 14:30
 To: Dave Page
 Cc: [EMAIL PROTECTED]
 Subject: Re: [CORE] Schedule for release? 
 
  In pgAdmin we have a simple bootstrap script that writes all the version
  numbers into a bunch of files, and then runs automake/autoconf et al. It
  sucks them out of one header, which is all we have to modify manually.
  Any reason we can't do something similar for PostgreSQL?
 
 The pg_config.h.win32 file is intended to support building in an
 environment where you can't run automake/autoconf, or indeed much of
 anything else. 

That doesn't matter, does it? Marc runs the bootstrap, which inserts the
version numbers into the right place and runs autoconf, then he commits
the changed files (configure, pg_config.h.win32 etc.) to CVS. Only he (or
you or Bruce) should ever need to run it.

 Perhaps we could require Perl or something to be
 available?  I know Magnus is requiring Perl for his MSVC 
 build support.

Yes.

 This is getting offtopic for -core though.

Moved to -hackers...

Regards, Dave.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Release stamping (Was: [CORE] Schedule for release?)

2006-10-24 Thread Tom Lane
Dave Page dpage@vale-housing.co.uk writes:
 From: Tom Lane [mailto:[EMAIL PROTECTED] 
 The pg_config.h.win32 file is intended to support building in an
 environment where you can't run automake/autoconf, or indeed much of
 anything else. 

 That doesn't matter does it? Marc runs the bootstrap, which inserts the
 version numbers into the right place and runs autoconf, then he commits
 the changed files (configure, pg_config.h.win32 etc) to CVS. Only he (or
 you or Bruce) should ever need to run it.

Hmm, so manufacture pg_config.h.win32 during tarball build and insert
the version numbers at that point?  Yeah, that would work.  Actually the
easiest thing would likely be to have configure build it the same way it
builds pg_config.h, and then not remove it in make distclean.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Simon Riggs
On Tue, 2006-10-24 at 09:37 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  On Mon, 2006-10-23 at 15:12 -0400, Tom Lane wrote:
  There are actually three checks used to detect end of WAL: zero record
  length, invalid checksum, and incorrect back-pointer.  Zero length is
  the first and cleanest-looking test, but AFAICS we have to have both of
  the others to avoid obvious failure modes.
 
  The checksum protects from torn pages and disk errors. If you have
  full_page_writes set then you already believe yourself safe from torn
  pages and your disks could also already be CRC-checking the data.
 
 No, because unlike tuples, WAL records can and do cross page boundaries.

But not that often with full_page_writes = off. So we could get away
with CRC-checking just the page-spanning records, marking each record to
show whether it has been CRC-checked already or needs to be rechecked at
recovery time. That would reduce the CRC overhead to about 1-5% of what
it is now (as an option).

Just a thought: Would there be benefit in not allowing page-spanning WAL
records, when they are small? That would tend to reduce the number of
WAL writes, even if it did cause some space wastage on disk. That would
reduce the number of same block re-writes and might improve the
sequential behaviour of WAL access. We never needed to think about this
while full_page_writes=on was the only option.
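The no-span idea could be sketched like this (purely illustrative; the constants, threshold, and function are hypothetical, not existing xlog code):

```c
#include <stdint.h>

#define WAL_PAGE_SIZE      8192u
#define SMALL_RECORD_LIMIT 512u   /* hypothetical "small record" threshold */

/*
 * Sketch of the proposal: if a small record would straddle a page
 * boundary, skip ahead to the next page (wasting the remainder of the
 * current one) instead of splitting the record.  Large records still
 * span pages as before.  Returns the position at which to write a
 * record of rec_len bytes, given the current insert position.
 */
static uint32_t
choose_insert_pos(uint32_t insert_pos, uint32_t rec_len)
{
    uint32_t room = WAL_PAGE_SIZE - (insert_pos % WAL_PAGE_SIZE);

    if (rec_len > room && rec_len <= SMALL_RECORD_LIMIT)
        return insert_pos + room;   /* pad to the next page boundary */
    return insert_pos;              /* write in place (may span pages) */
}
```

The trade-off is exactly as described above: some space is wasted at page ends, in exchange for fewer records that need the cross-page CRC treatment and fewer rewrites of the same block.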

  CRC-checked disks are actually the industry norm and have been for
  around 5 years.
 
 Huh?  Disks have ALWAYS had CRCs, and this is in any case utterly
 irrelevant to the database-crash risk.

According to the man from Seagate...

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PATCHES] WAL logging freezing

2006-10-24 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 I would've liked to give freezing a new opcode,
   but we've run out of them (see htup.h).

Hardly ... we have plenty of unused rmgr id's still.

The real issue that still has to be resolved is the interaction of all
this stuff with PITR scenarios --- is it still safe to not log hint-bit
updates when PITR is on?  I think it's premature to start writing
patches until we've decided how this really needs to work.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread mark
On Tue, Oct 24, 2006 at 02:52:36PM +0100, Simon Riggs wrote:
 On Tue, 2006-10-24 at 09:37 -0400, Tom Lane wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
   On Mon, 2006-10-23 at 15:12 -0400, Tom Lane wrote:
   There are actually three checks used to detect end of WAL: zero record
   length, invalid checksum, and incorrect back-pointer.  Zero length is
   the first and cleanest-looking test, but AFAICS we have to have both of
   the others to avoid obvious failure modes.
   The checksum protects from torn pages and disk errors. If you have
   full_page_writes set then you already believe yourself safe from torn
   pages and your disks could also already be CRC-checking the data.
  No, because unlike tuples, WAL records can and do cross page boundaries.

 But not that often, with full_page_writes = off. So we could get away
 with just CRC checking the page-spanning ones and mark the records to
 show whether they have been CRC checked or not and need to be rechecked
 at recovery time. That would reduce the CRC overhead to about 1-5% of
 what it is now (as an option).

WAL pages are 8 Kbytes and disk sectors are 512 bytes, correct? I don't
see a guarantee here that the 8 Kbytes of data will be written as
sequential writes, nor that the 8 Kbytes of data will necessarily finish
being written.

If the operating system uses 8 Kbyte pages, or the RAID system uses 8
Kbytes or larger chunks, and they guarantee sequential writes, perhaps
it is ok. Still, if the power goes out after writing the first 512
bytes, 2048 bytes, or 4096 bytes, then what? With RAID involved it
might get better or worse, depending on the RAID configuration.

I'm almost wondering whether the three checks are enough. I'm too busy
to sketch it all out and predict failure points... :-)

 Just a thought: Would there be benefit in not allowing page-spanning WAL
 records, when they are small? That would tend to reduce the number of
 WAL writes, even if it did cause some space wastage on disk. That would
 reduce the number of same block re-writes and might improve the
 sequential behaviour of WAL access. We never needed to think about this
 while full_page_writes=on was the only option.

Probably. Might not be much though.

   CRC-checked disks are actually the industry norm and have been for
   around 5 years.
  Huh?  Disks have ALWAYS had CRCs, and this is in any case utterly
  irrelevant to the database-crash risk.
 According to the man from Seagate...

Once upon a time, bits were stored the size of Smarties (an
exaggeration). Back in those days (before my time), they liked to use
things like parity.

I didn't read the page, but perhaps there is some confusion between CRC
and error-correction codes. Obviously, technologies were introduced over
time, but I don't remember ever having a hard disk that didn't have some
sort of error detection. The five-year claim seems decades too short
unless they are talking about a newer technology. Even the old 5.25"
DOS floppies seemed able to detect errors rather than return corrupted
bits.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Release stamping (Was: [CORE] Schedule for release?)

2006-10-24 Thread Peter Eisentraut
Tom Lane wrote:
 Hmm, so manufacture pg_config.h.win32 during tarball build and insert
 the version numbers at that point?  Yeah, that would work.  Actually
 the easiest thing would likely be to have configure build it the same
 way it builds pg_config.h, and then not remove it in make
 distclean.

The libpq makefile contains ample precedent for preparing Windows 
specific files during tarball building.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Tom Lane
[EMAIL PROTECTED] writes:
 The 5 year claim seems
 decades too short unless they are talking about a newer technology.

I think what Simon is on about is CRCs being routinely used on the cable
between the disk drive and the CPU.  When I was involved in this stuff
you usually only got a parity bit on each byte transferred.  CRCs for
all disk command/result messages would definitely help close a risk area
--- but that's only one risk of many.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Release stamping (Was: [CORE] Schedule for release?)

2006-10-24 Thread Magnus Hagander
  The pg_config.h.win32 file is intended to support building in an 
  environment where you can't run automake/autoconf, or 
 indeed much of 
  anything else.
 
  That doesn't matter does it? Marc runs the bootstrap, which inserts 
  the version numbers into the right place and runs autoconf, then he 
  commits the changed files (configure, pg_config.h.win32 
 etc) to CVS. 
  Only he (or you or Bruce) should ever need to run it.
 
 Hmm, so manufacture pg_config.h.win32 during tarball build 
 and insert the version numbers at that point?  Yeah, that 
 would work.  Actually the easiest thing would likely be to 
 have configure build it the same way it builds pg_config.h, 
 and then not remove it in make distclean.

Getting late into this discussion, so I may be completely off here :-)
How's that going to work? pg_config.h.win32 needs to know win32 platform
specifics, right? So it has to be created, in that case, on win32. But
when you're building with MSVC, you don't run configure, because Windows
can't run that (without the mingw layer).

//Magnus

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Conference materials (Was: [HACKERS] pdfs of the conference)

2006-10-24 Thread Andrew Sullivan
On Tue, Sep 19, 2006 at 12:29:18PM -0400, Andrew Sullivan wrote:
 The slides, all the photos, and even the audio are, I've been
 assured, going to get cleared up in the next few days. 

Well, those were some very long days, but it seems a good time to
note that the slides and audio (all that we have) are now on the
conference site, linked to the conference programme from this summer:

http://conference.postgresql.org/Program
 
Thanks very much to Gavin Sherry, who did a great deal of the work on
this.  If your materials are not there, it's because we don't have
them.  If you send them to me, I'll put them in place as soon as I
receive them.  Really, I will.  My TODO list doesn't need to get
longer.

I have the unhappy duty of announcing, too, that the audio recordings
from the second room at the conference were lost, along with the rest
of the things on the laptop where they were stored.  

Thanks very much to all the speakers and attendees who made the
conference a success.

Best,
Andrew


-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The whole tendency of modern prose is away from concreteness.
--George Orwell

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Release stamping (Was: [CORE] Schedule for release?)

2006-10-24 Thread Dave Page
 

 -Original Message-
 From: Magnus Hagander [mailto:[EMAIL PROTECTED] 
 Sent: 24 October 2006 15:56
 To: Tom Lane; Dave Page
 Cc: pgsql-hackers@postgresql.org
 Subject: RE: [HACKERS] Release stamping (Was: [CORE] Schedule 
 for release?) 
 
 Getting late into this discussion, so I may be completely off here :-)
 How's that going to work? pg_config.h.win32 needs to know
 win32 platform
 specifics, right? So it has to be created, in that case, on win32. But
 when you're building with MSVC, you don't run configure, 
 because windows
 can't run that (without the mingw layer).

Sorry - we're just talking about getting the version number in there
automatically to avoid it getting forgotten during release bundling.

Regards, Dave.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Release stamping (Was: [CORE] Schedule for release?)

2006-10-24 Thread Magnus Hagander
  Getting late into this discussion, so I may be completely 
 off here :-) 
  How's that going to work? pg_config.h.win32 needs to know
  win32 platform
  specifics, right? So it has to be created, in that case, on 
 win32. But 
  when you're building with MSVC, you don't run configure, because 
  windows can't run that (without the mingw layer).
 
 Sorry - we're just talking about getting the version number 
 in there automatically to avoid it getting forgotten during 
 release bundling.

I can see that being a good idea. But I don't see Tom's ./configure
solution working.

What we could do is have the msvc build scripts edit the file and
replace the version with something they read from configure.in when run.
This would require that we zap the old win32.mak method of building
win32 stuff, which we can't do just yet but IMHO can do eventually.

The other option, I would think, is to break out the version #defines
into a separate header file that's used on all platforms, and use that
one *instead* of configure to set it.
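That separate header might look something like the following hypothetical pg_version.h. The macro names and values here are an illustration (modeled on the existing PG_VERSION family), not actual source:

```c
#include <string.h>

/* Hypothetical pg_version.h: stamped once per release and included by
 * the Unix configure build, the MSVC scripts, and the Borland makefiles
 * alike, so the numbers live in exactly one place. */
#define PG_MAJORVERSION "8.2"
#define PG_VERSION      "8.2.0"
#define PG_VERSION_STR  "PostgreSQL 8.2.0"

/* Every build then derives its version strings from that one header. */
static const char *
pg_version_string(void)
{
    return PG_VERSION_STR;
}
```

Release stamping would then mean editing this single file, with no risk of one of several copies being missed.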

//Magnus

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Release stamping (Was: [CORE] Schedule for release?)

2006-10-24 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Sorry - we're just talking about getting the version number 
 in there automatically to avoid it getting forgotten during 
 release bundling.

 I can see that being a good idea. But I don't see Tom's ./configure
 solution working.

Why not?  The shipped tarball would contain exactly the same
pg_config.h.win32 it does today; the only difference is that the
version info would've been inserted automatically instead of manually.
(The start of this discussion was my observation that pg_config.h.win32
contains multiple copies of the version info, and sooner or later
somebody would miss one while stamping a release.)

 What we could do is have the msvc build scripts edit the file and
 replace the version with something it reads from configure.in when run.

That's great if you're using msvc, but what about borland?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Release stamping (Was: [CORE] Schedule for release?)

2006-10-24 Thread Peter Eisentraut
Tom Lane wrote:
 Why not?  The shipped tarball would contain exactly the same
 pg_config.h.win32 it does today; the only difference is that the
 version info would've been inserted automatically instead of
 manually.

I suggest you do it in a makefile as part of the distprep target.

distprep: pg_config.h.win32

pg_config.h.win32: pg_config.h.win32.in
	sed 's/@VERSION@/$(VERSION)/g' $< >$@

maintainer-clean:
	rm -f pg_config.h.win32


That way we don't bother every configure runner with the issue.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Simon Riggs
On Tue, 2006-10-24 at 10:18 -0400, [EMAIL PROTECTED] wrote:
 On Tue, Oct 24, 2006 at 02:52:36PM +0100, Simon Riggs wrote:
  On Tue, 2006-10-24 at 09:37 -0400, Tom Lane wrote:
   No, because unlike tuples, WAL records can and do cross page boundaries.
 
  But not that often, with full_page_writes = off. So we could get away
  with just CRC checking the page-spanning ones and mark the records to
  show whether they have been CRC checked or not and need to be rechecked
  at recovery time. That would reduce the CRC overhead to about 1-5% of
  what it is now (as an option).
 
 WAL pages 8 Kbytes, and disk pages 512 bytes, correct? I don't see a
 guarantee in here that the 8 Kbytes worth of data will be written as
 sequential writes, nor that the 8 Kbytes of data will necessarily
 finish.
 
 If the operating system uses 8 Kbyte pages, or the RAID system uses 8
 Kbytes or larger chunks, and they guarantee sequential writes, perhaps
 it is ok. Still, if the power goes out after writing the first 512
 bytes, 2048 bytes, or 4096 bytes, then what? With RAID involved it
 might get better or worse, depending on the RAID configuration.

That is the torn-page problem. If your system doesn't already protect
you against this you have no business turning off full_page_writes,
which was one of my starting assumptions.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] Incorrect behavior with CE and ORDER BY

2006-10-24 Thread Joshua D. Drake
Hello,

We have a problem with CE (constraint exclusion) that I want to verify
is either expected behavior, a bug, or something else :).

Yes, constraint exclusion is on.

I have tried increasing default_statistics_target (all the way to 1000)
with no change in behavior.

Query plan with ORDER BY:

 Limit  (cost=47110.19..47110.31 rows=50 width=8) (actual time=6088.013..6088.269 rows=50 loops=1)
   ->  Sort  (cost=47110.19..47943.14 rows=333179 width=8) (actual time=6088.007..6088.104 rows=50 loops=1)
         Sort Key: public.tbl_profile_search.pse_lastlogin
         ->  Result  (cost=0.00..16547.78 rows=333179 width=8) (actual time=0.020..4339.472 rows=334319 loops=1)
               ->  Append  (cost=0.00..16547.78 rows=333179 width=8) (actual time=0.016..3208.022 rows=334319 loops=1)
                     ->  Seq Scan on tbl_profile_search  (cost=0.00..2.27 rows=1 width=8) (actual time=0.012..0.047 rows=2 loops=1)
                           Filter: (((pse_normalized_text)::text = '1'::text) AND (pse_interest_type = 10))
                     ->  Index Scan using index_pse_09_on_part_1 on tbl_profile_search_interest_1 tbl_profile_search  (cost=0.00..4.73 rows=1 width=8) (actual time=0.202..0.202 rows=0 loops=1)
                           Index Cond: ((pse_normalized_text)::text = '1'::text)
                           Filter: (pse_interest_type = 10)
                     ->  Bitmap Heap Scan on tbl_profile_search_interest_10 tbl_profile_search  (cost=3579.12..16540.78 rows=333177 width=8) (actual time=90.619..2116.224 rows=334317 loops=1)
                           Recheck Cond: ((pse_normalized_text)::text = '1'::text)
                           Filter: (pse_interest_type = 10)
                           ->  Bitmap Index Scan on index_pse_09_on_part_10  (cost=0.00..3579.12 rows=333177 width=0) (actual time=89.052..89.052 rows=340964 loops=1)
                                 Index Cond: ((pse_normalized_text)::text = '1'::text)
 Total runtime: 6103.190 ms


Same query, just removed ORDER BY:


---
 Limit  (cost=0.00..2.48 rows=50 width=4) (actual time=0.025..57.146
rows=50 loops=1)
   ->  Result  (cost=0.00..16549.78 rows=333179 width=4) (actual
time=0.021..56.993 rows=50 loops=1)
 ->  Append  (cost=0.00..16549.78 rows=333179 width=4) (actual
time=0.017..56.835 rows=50 loops=1)
   ->  Seq Scan on tbl_profile_search  (cost=0.00..2.27
rows=1 width=4) (actual time=0.013..0.050 rows=2 loops=1)
 Filter: (((pse_normalized_text)::text = '1'::text)
AND (pse_interest_type = 10))
   ->  Index Scan using index_pse_09_on_part_1 on
tbl_profile_search_interest_1 tbl_profile_search  (cost=0.00..4.73
rows=1 width=4) (actual time=0.051..0.051 rows=0 loops=1)
 Index Cond: ((pse_normalized_text)::text = '1'::text)
 Filter: (pse_interest_type = 10)
   ->  Bitmap Heap Scan on tbl_profile_search_interest_10
tbl_profile_search  (cost=3581.12..16542.78 rows=333177 width=4) (actual
time=56.481..56.573 rows=48 loops=1)
 Recheck Cond: ((pse_normalized_text)::text = '1'::text)
 Filter: (pse_interest_type = 10)
 ->  Bitmap Index Scan on index_pse_09_on_part_10
(cost=0.00..3581.12 rows=333177 width=0) (actual time=54.999..54.999
rows=341233 loops=1)
   Index Cond: ((pse_normalized_text)::text =
'1'::text)
 Total runtime: 57.396 ms


-- 

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread mark
On Tue, Oct 24, 2006 at 05:05:58PM +0100, Simon Riggs wrote:
 On Tue, 2006-10-24 at 10:18 -0400, [EMAIL PROTECTED] wrote:
  On Tue, Oct 24, 2006 at 02:52:36PM +0100, Simon Riggs wrote:
   On Tue, 2006-10-24 at 09:37 -0400, Tom Lane wrote:
No, because unlike tuples, WAL records can and do cross page boundaries.
   But not that often, with full_page_writes = off. So we could get away
   with just CRC checking the page-spanning ones and mark the records to
   show whether they have been CRC checked or not and need to be rechecked
   at recovery time. That would reduce the CRC overhead to about 1-5% of
   what it is now (as an option).
  
  WAL pages 8 Kbytes, and disk pages 512 bytes, correct? I don't see a
  guarantee in here that the 8 Kbytes worth of data will be written as
  sequential writes, nor that the 8 Kbytes of data will necessarily
  finish.
  
  If the operating system uses 8 Kbyte pages, or the RAID system uses 8
  Kbytes or larger chunks, and they guarantee sequential writes, perhaps
  it is ok. Still, if the power goes out after writing the first 512
  bytes, 2048 bytes, or 4096 bytes, then what? With RAID involved it
  might get better or worse, depending on the RAID configuration.
 
 That is the torn-page problem. If your system doesn't already protect
 you against this you have no business turning off full_page_writes,
 which was one of my starting assumptions.

I wasn't aware that a system could protect against this. :-)

I write 8 Kbytes - how can I guarantee that the underlying disk writes
all 8 Kbytes before it loses power? And why isn't the CRC a valid means
of dealing with this? :-)

If I'm wrong on one of these assumptions, I'm open to being educated.
My opinion as of a few seconds ago, is that a write to a single disk
sector is safe, but that a write that extends across several sectors
is not.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/




Re: [HACKERS] Incorrect behavior with CE and ORDER BY

2006-10-24 Thread Tom Lane
Joshua D. Drake [EMAIL PROTECTED] writes:
 We have a problem with CE that I want to verify is either expected
 behavior, a bug or something else :).

Uh, what's your problem exactly?  The example only seems to demonstrate
that if you don't ask for a sort, you don't get one.

regards, tom lane



Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Simon Riggs
On Tue, 2006-10-24 at 12:47 -0400, [EMAIL PROTECTED] wrote:
 On Tue, Oct 24, 2006 at 05:05:58PM +0100, Simon Riggs wrote:
  On Tue, 2006-10-24 at 10:18 -0400, [EMAIL PROTECTED] wrote:
   On Tue, Oct 24, 2006 at 02:52:36PM +0100, Simon Riggs wrote:
On Tue, 2006-10-24 at 09:37 -0400, Tom Lane wrote:
 No, because unlike tuples, WAL records can and do cross page 
 boundaries.
But not that often, with full_page_writes = off. So we could get away
with just CRC checking the page-spanning ones and mark the records to
show whether they have been CRC checked or not and need to be rechecked
at recovery time. That would reduce the CRC overhead to about 1-5% of
what it is now (as an option).
   
   WAL pages 8 Kbytes, and disk pages 512 bytes, correct? I don't see a
   guarantee in here that the 8 Kbytes worth of data will be written as
   sequential writes, nor that the 8 Kbytes of data will necessarily
   finish.
   
   If the operating system uses 8 Kbyte pages, or the RAID system uses 8
   Kbytes or larger chunks, and they guarantee sequential writes, perhaps
   it is ok. Still, if the power goes out after writing the first 512
   bytes, 2048 bytes, or 4096 bytes, then what? With RAID involved it
   might get better or worse, depending on the RAID configuration.
  
  That is the torn-page problem. If your system doesn't already protect
  you against this you have no business turning off full_page_writes,
  which was one of my starting assumptions.
 
 I wasn't aware that a system could protect against this. :-)
 
 I write 8 Kbytes - how can I guarantee that the underlying disk writes
 all 8 Kbytes before it loses power? And why isn't the CRC a valid means
 of dealing with this? :-)
 
 If I'm wrong on one of these assumptions, I'm open to being educated.
 My opinion as of a few seconds ago, is that a write to a single disk
 sector is safe, but that a write that extends across several sectors
 is not.

http://developer.postgresql.org/pgdocs/postgres/runtime-config-wal.html

full_page_writes = off

I'm very happy to learn more myself...

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] Incorrect behavior with CE and ORDER BY

2006-10-24 Thread Joshua D. Drake
Tom Lane wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
 We have a problem with CE that I want to verify is either expected
 behavior, a bug or something else :).
 
 Uh, what's your problem exactly?  The example only seems to demonstrate
 that if you don't ask for a sort, you don't get one.

Sorry. The problem is, if I ask for an ORDER BY it scans all partitions
versus only scanning the partition that has the data in it.

Sincerely,

Joshua D. Drake


 
   regards, tom lane
 






Re: [HACKERS] Incorrect behavior with CE and ORDER BY

2006-10-24 Thread Alvaro Herrera
Joshua D. Drake wrote:
 Tom Lane wrote:
  Joshua D. Drake [EMAIL PROTECTED] writes:
  We have a problem with CE that I want to verify is either expected
  behavior, a bug or something else :).
  
  Uh, what's your problem exactly?  The example only seems to demonstrate
  that if you don't ask for a sort, you don't get one.
 
 Sorry. The problem is, if I ask for an ORDER BY it scans all partitions
 versus only scanning the partition that has the data in it.

Huh, but that's not what the EXPLAIN ANALYZE you posted says ...

-- 
Alvaro Herrera                 http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Simon Riggs
On Tue, 2006-10-24 at 14:52 +0100, Simon Riggs wrote:
 On Tue, 2006-10-24 at 09:37 -0400, Tom Lane wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
   On Mon, 2006-10-23 at 15:12 -0400, Tom Lane wrote:
   There are actually three checks used to detect end of WAL: zero record
   length, invalid checksum, and incorrect back-pointer.  Zero length is
   the first and cleanest-looking test, but AFAICS we have to have both of
   the others to avoid obvious failure modes.
  
   The checksum protects from torn pages and disk errors. If you have
   full_page_writes set then you already believe yourself safe from torn
   pages and your disks could also already be CRC-checking the data.
  
  No, because unlike tuples, WAL records can and do cross page boundaries.
 
 But not that often, with full_page_writes = off. So we could get away
 with just CRC checking the page-spanning ones and mark the records to
 show whether they have been CRC checked or not and need to be rechecked
 at recovery time. That would reduce the CRC overhead to about 1-5% of
 what it is now (as an option).

Looking further, I see that the xlog page header already contains
xlp_pageaddr which is a XLogRecPtr. So an xlrec that tried to span
multiple pages yet failed in between would easily show up as a failure
in ValidXLOGHeader(), even before the CRC check. [The XLogRecPtr
contains both the offset within the file and a unique identification of
the WAL file, so any data left over from previous uses of that data file
will be easily recognised as such].

So we don't even need to CRC check the page-spanning ones either.
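For illustration, the three end-of-WAL tests under discussion could be sketched like this (a simplified sketch only, not the actual xlog.c code: the struct layout, field names, the wal_checksum flag, and the toy wal_crc() stand-in are all invented for the example):

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the real XLOG structures. */
typedef struct
{
    uint32_t xlogid;            /* log file id */
    uint32_t xrecoff;           /* byte offset within the log */
} XLogRecPtr;

typedef struct
{
    uint32_t   xl_tot_len;      /* total record length; 0 => end of WAL */
    XLogRecPtr xl_prev;         /* back-pointer to the previous record */
    uint32_t   xl_crc;          /* checksum of the record payload */
} WalRecordHdr;

/* Toy checksum stand-in; NOT a real CRC implementation. */
static uint32_t
wal_crc(const WalRecordHdr *rec)
{
    return rec->xl_tot_len * 2654435761u;
}

/*
 * The three end-of-WAL tests, in the order discussed in this thread:
 * zero record length, incorrect back-pointer, and then (only if a
 * hypothetical "wal_checksum = off" GUC has not disabled it) the CRC.
 */
static bool
record_looks_valid(const WalRecordHdr *rec, XLogRecPtr expected_prev,
                   bool wal_checksum)
{
    if (rec->xl_tot_len == 0)
        return false;           /* zero length: clean end of WAL */

    if (rec->xl_prev.xlogid != expected_prev.xlogid ||
        rec->xl_prev.xrecoff != expected_prev.xrecoff)
        return false;           /* leftover data from a recycled segment */

    if (wal_checksum && wal_crc(rec) != rec->xl_crc)
        return false;           /* torn write or disk error */

    return true;
}
```

Note how, with wal_checksum = false, a record with a bad CRC but a correct back-pointer would still be accepted — which is exactly the trade-off being debated here.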

So it all comes down to: do you trust your hardware? I accept that some
hardware/OS combinations will give you high risk, others will reduce it
considerably. All I'm looking to do is to pass on the savings for people
that are confident they have invested wisely in hardware.

Does anybody have a reasonable objection to introducing an option for
people that are comfortable they are making the correct choice?
wal_checksum = off (or other suggested naming...)

I don't want to take blind risks, so shoot me down, please, if I err.
I'm happy either way: either we speed up, or we're safer not to.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Alvaro Herrera
[EMAIL PROTECTED] wrote:

 If I'm wrong on one of these assumptions, I'm open to being educated.
 My opinion as of a few seconds ago, is that a write to a single disk
 sector is safe, but that a write that extends across several sectors
 is not.

Unless it's fsync'ed, which is what we do at CHECKPOINT.  Keep in mind
that we save full page images on WAL the first time we touch the page
after a checkpoint.  This means that if a partial write occurred, we will
restore it from WAL.

So it's not safe in general, but it is safe in Postgres.

-- 
Alvaro Herrera                 http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Gregory Maxwell

On 10/24/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

I wasn't aware that a system could protect against this. :-)

I write 8 Kbytes - how can I guarantee that the underlying disk writes
all 8 Kbytes before it loses power? And why isn't the CRC a valid means
of dealing with this? :-)


[snip]

A file system with an appropriate transaction method could do this.
In *theory*, reiser4 write()s are atomic. No one has verified, however,
that there is no torn-page risk introduced in some other part of the
kernel.

I'm not aware of any other system which can guarantee the atomicity
of 8k writes.



Re: [HACKERS] Incorrect behavior with CE and ORDER BY

2006-10-24 Thread Joshua D. Drake
Alvaro Herrera wrote:
 Joshua D. Drake wrote:
 Tom Lane wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
 We have a problem with CE that I want to verify is either expected
 behavior, a bug or something else :).
 Uh, what's your problem exactly?  The example only seems to demonstrate
 that if you don't ask for a sort, you don't get one.
 Sorry. The problem is, if I ask for an ORDER BY it scans all partitions
 versus only scanning the partition that has the data in it.
 
 Huh, but that's not what the EXPLAIN ANALYZE you posted says ...
 

Sorry, I realize the error of my ways. It isn't that it is scanning all
partitions; it is that it is scanning all of a single partition (subject
to the WHERE clause). That is correct behavior.

Sincerely,

Joshua D. Drake





Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Tom Lane
Gregory Maxwell [EMAIL PROTECTED] writes:
 I'm not aware of any other system which can guarantee the atomicity
 of 8k writes.

The reasoning for supporting full_page_writes = off is that if you have
a stable kernel and suitable backup power, battery backed write cache,
etc, your risk of a partially completed write() may be low enough to
be acceptable.  Obviously there are no 100.00% guarantees, but that's
what you keep backups for ...

Simon is essentially arguing that if we are willing to assume no
incomplete write() we may as well assume it for WAL too.  This seems
to me to be raising the risk significantly, but I admit that I can't
put my finger on why exactly.

One point not directly related to crash safety is whether CRC checking
isn't still going to be a good idea when PITR is enabled.  Archived WAL
files are going to have been through a great deal more transferring than
ones merely being used to recover from a crash.

regards, tom lane



Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Gurjeet Singh
Sorry for getting into the conversation so late... It was a long weekend
in India.

On 10/23/06, Tom Lane [EMAIL PROTECTED] wrote:
 I didn't particularly trust the timing calculations in your benchmark
 program,

Any particular reason? (Why, and what did you doubt in it?) I designed
the program to be flexible, to test different sized blocks (to cost
single/fewer INIT/COMP/FIN iterations) and different sized lists of data
(to control the number of iterations). Please share your wisdom.

When I first saw your results, I had a strong feeling that function-call
overhead was going against SB8. And then Jeremy's trials, and subsequent
success, on disabling loop optimizations also pointed to this
possibility.

So, I have taken your tests and converted the SB8 function calls into
macros. The results are (note that crc = 0 is explained later):

std_8192_noprintcrc.out: crc = 0, bufsize = 8192, loops = 100, elapsed = 8.471994
sb8_8192_noprintcrc.out: crc = 0, bufsize = 8192, loops = 100, elapsed = 0.06
std_8192_printcrc.out:   crc = 8228BB0E, bufsize = 8192, loops = 100, elapsed = 32.490704
sb8_8192_printcrc.out:   crc = 7E67A22A, bufsize = 8192, loops = 100, elapsed = 22.349156
std_64_noprintcrc.out:   crc = 0, bufsize = 64, loops = 100, elapsed = 0.151354
sb8_64_noprintcrc.out:   crc = 0, bufsize = 64, loops = 100, elapsed = 0.05
std_64_printcrc.out:     crc = 9C9FBE2E, bufsize = 64, loops = 100, elapsed = 0.559315
sb8_64_printcrc.out:     crc = F70BC6AE, bufsize = 64, loops = 100, elapsed = 0.357382

The result names are in the format:
algo_type_test_size_was_mycrc_referenced_in_printf.out

crc = 0 in a result means that the mycrc variable was not referenced
anywhere after the for-loop. As can be seen, if mycrc is not referenced
in the printf, that is, its usage is limited to just inside the 'for'
loop, then GCC (4.1) seems to be optimizing the loop heavily. In the
case of SB8, if mycrc is not referenced later, it seems to have removed
the loop entirely! The only difference between the x_noprintcrc and
x_printcrc tests was that, in the printf() call, the first parameter
after the format string was either a zero or the mycrc variable,
respectively.

I am apprehensive that I might have made some mistake while converting
the function calls to macros, though I have not been able to find one
thus far. Please check its validity as compared to the function-call
version.

If there's no mistake, then I think SB8 is back in the performance game.
These results were obtained with gcc 4.1 on FC5, running on an Intel
Pentium M 1.86 GHz, with the OS started and running in runlevel 3.

Please dump the .c and .h files from the attachment on top of Tom's
package, and test it as earlier.

Best regards,

-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | yahoo }.com
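The "crc = 0, elapsed ~ 0" effect described here — GCC deleting the entire timing loop once its result is provably unused — is easy to reproduce with any loop whose result is dead. A small sketch (the CRC here is a toy placeholder, not the slicing-by-8 code, and the function names are invented):

```c
#include <stdint.h>
#include <stddef.h>

/* Toy byte-at-a-time CRC-ish mix, standing in for the real algorithm. */
static uint32_t
toy_crc(const unsigned char *buf, size_t len, uint32_t crc)
{
    for (size_t i = 0; i < len; i++)
        crc = (crc >> 8) ^ ((crc ^ buf[i]) * 0x9E3779B1u);
    return crc;
}

/*
 * Benchmark kernel.  Because the final value of mycrc is returned (and
 * the caller presumably prints it), the compiler must keep the loop.
 * If the caller printed a literal 0 instead, and this were inlined,
 * gcc at -O2 would be free to delete the whole loop as dead code --
 * which is exactly the effect seen in the noprintcrc results above.
 */
static uint32_t
run_kernel(int loops)
{
    unsigned char buf[8192];
    uint32_t mycrc = ~0u;

    for (size_t i = 0; i < sizeof(buf); i++)
        buf[i] = (unsigned char) i;
    for (int l = 0; l < loops; l++)
        mycrc = toy_crc(buf, sizeof(buf), mycrc);
    return mycrc;
}
```

The moral for micro-benchmarks: always consume the computed value (print it, return it, or store it to a volatile), or the optimizer may measure nothing at all.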



my-crctest.tar.gz
Description: GNU Zip compressed data



Re: [HACKERS] [DOCS] Replication documentation addition

2006-10-24 Thread Markus Schiltknecht

Hello Bruce,

Bruce Momjian wrote:

Here is a new replication documentation section I want to add for 8.2:

ftp://momjian.us/pub/postgresql/mypatches/replication

Comments welcomed.


Thank you, that sounds good. It's targeted to production use and 
currently available solutions, which makes sense in the official manual.


You are explaining the sync vs. async categorization, but I sort of 
asked myself where the explanation of single- vs. multi-master has gone. 
I then realized that you are talking about read-only and read/write 
mixes of servers. Then again, you mention 'Multi-Master Replication' as 
one type of replication solution. I think we should be consistent in our 
naming. As Single- and Multi-Master are the more common terms among 
database replication experts, I'd recommend using them and explaining 
what they mean instead of introducing new names.


Along with that, I'd argue that Single- vs. Multi-Master is as much a 
categorization as Sync vs. Async. In that sense, the last chapter should 
probably be named 'Distributed-Shared-Memory Replication' or something 
like that instead of 'Multi-Master Replication', because, as we know, 
there are several ways of doing Multi-Master Replication (Slony-II / 
Postgres-R, Distributed Shared Memory, 2PC in application code, or the 
above-mentioned 'Query Broadcast Replication', which would fall into a 
Multi-Master Replication model as well).


Also in the last chapter, instead of just saying that PostgreSQL does 
not offer this type of replication, we could probably say that 
different projects are trying to come up with better replication 
solutions, and that there are several proprietary products based on 
PostgreSQL which do solve some kinds of Multi-Master Replication. Not 
that I want to advertise for any of them, but it just sounds better than 
the current 'no, we don't offer that'.


As this documentation mainly covers production-quality solutions (which 
is absolutely perfect), can we document the status of current projects 
somewhere, probably in a wiki? Or at least mention them somewhere and 
point to their websites? It would help to get rid of all those rumors 
and uncertainties. Or are those intentional?


Just my two cents.

Regards

Markus



Re: [HACKERS] Replication documentation addition

2006-10-24 Thread Hannu Krosing
On Tue, 2006-10-24 at 00:20, Bruce Momjian wrote:
 Here is a new replication documentation section I want to add for 8.2:
 
   ftp://momjian.us/pub/postgresql/mypatches/replication

This is how data partitioning is currently described there

 Data Partitioning
 -----------------
 
 Data partitioning splits the database into data sets.  To achieve
 replication, each data set can only be modified by one server.  For
 example, data can be partitioned by offices, e.g. London and Paris. 
 While London and Paris servers have all data records, only London can
 modify London records, and Paris can only modify Paris records.  Such
 partitioning is usually accomplished in application code, though rules
 and triggers can help enforce partitioning and keep the read-only data
 sets current.  Slony can also be used in such a setup.  While Slony
 replicates only entire tables, London and Paris can be placed in
 separate tables, and inheritance can be used to access from both tables
 using a single table name.

Maybe another use of partitioning should also be mentioned: when
partitioning is used to overcome the limitations of a single server
(especially I/O and memory, but also CPU), and only a subset of the data
is stored and processed on each server.

As an example of this type of partitioning you could mention Bizgres MPP
(a PG-based commercial product, http://www.greenplum.com), which
partitions data to use the I/O and CPU of several DB servers for
processing complex OLAP queries, and PL/Proxy
(http://pgfoundry.org/projects/plproxy/), which does the same for OLTP
loads.

I think the official term for this kind of replication is
Shared-Nothing Clustering.

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com






Re: [HACKERS] Replication documentation addition

2006-10-24 Thread Markus Schiltknecht

Hannu Krosing wrote:

I think the official term for this kind of replication is
Shared-Nothing Clustering.


Well, that's just another distinction for clusters. Most of the time 
the distinction is between Shared-Disk and Shared-Nothing. You could 
also see the very big irons as a Shared-Everything cluster.


While it's certainly true that any kind of data partitioning for 
databases only makes sense for Shared-Nothing clusters, I don't think 
it's a 'kind of replication'. AFAIK most database replication solutions 
are built for Shared-Nothing clusters. (With the exception of 
PgCluster-II, I think.)


Regards

Markus






Re: [pgsql-advocacy] Conference materials (Was: [HACKERS] pdfs of

2006-10-24 Thread Teodor Sigaev

this.  If your materials are not there, it's because we don't have
them.  If you send them to me, I'll put them in place as soon as I
receive them.  Really, I will.  My TODO list doesn't need to get


Tsearch2 round table should have two presentations, one of them is already on 
site, another: http://www.sigaev.ru/gin/Gin.pdf

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Replication documentation addition

2006-10-24 Thread Josh Berkus
Bruce,

 Here is my first draft of a new replication section for our
 documentation.  I am looking for any comments.

Hmmm ... while the primer on different types of replication is fine, I 
think what users were really looking for is a listing of the different 
replication solutions which are available for PostgreSQL and how to get 
them.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] Incorrect behavior with CE and ORDER BY

2006-10-24 Thread Alvaro Herrera

I followed up with Joshua on Jabber.  This is the query:

SELECT pse_userid FROM tbl_profile_search WHERE pse_normalized_text='1'
and pse_interest_type = 10 order by pse_lastlogin DESC limit 50 offset 0

I suggested adding an index on (pse_normalized_text, pse_lastlogin), on
the assumption that the planner would get the sorted output from there
and be able to push the LIMIT clause just below the indexscan, thus
saving the big heap scan (and a sort across a large result set).  But it
turns out the index is already there.

So it seems to me to be a planner shortcoming.  Is this correct?

My idea of the plan would be (tabs=8 spaces)

Limit (50)
  Sort (key: pse_lastlogin)
    Result
      Append
        SeqScan tbl_profile_search
        IndexScan tbl_profile_search_interest_1
        Limit (50)
          IndexScan on the index mentioned above

Is this possible?  It would be very fast.  Maybe it should be like this
instead:

Limit (50)
  Sort (key: pse_lastlogin)
    Result
      Append
        Limit (50)
          SeqScan tbl_profile_search
        Limit (50)
          IndexScan tbl_profile_search_interest_1
        Limit (50)
          IndexScan on the index mentioned above

This is the actual plan:

  Limit  (cost=47110.19..47110.31 rows=50 width=8) (actual
 time=6088.013..6088.269 rows=50 loops=1)
    ->  Sort  (cost=47110.19..47943.14 rows=333179 width=8) (actual
 time=6088.007..6088.104 rows=50 loops=1)
  Sort Key: public.tbl_profile_search.pse_lastlogin
  ->  Result  (cost=0.00..16547.78 rows=333179 width=8) (actual
 time=0.020..4339.472 rows=334319 loops=1)
->  Append  (cost=0.00..16547.78 rows=333179 width=8)
 (actual time=0.016..3208.022 rows=334319 loops=1)
  ->  Seq Scan on tbl_profile_search
 (cost=0.00..2.27 rows=1 width=8) (actual time=0.012..0.047 rows=2 loops=1)
Filter: (((pse_normalized_text)::text =
 '1'::text) AND (pse_interest_type = 10))
  ->  Index Scan using index_pse_09_on_part_1 on
 tbl_profile_search_interest_1 tbl_profile_search  (cost=0.00..4.73
 rows=1 width=8) (actual time=0.202..0.202 rows=0 loops=1)
Index Cond: ((pse_normalized_text)::text =
 '1'::text)
Filter: (pse_interest_type = 10)
  ->  Bitmap Heap Scan on
 tbl_profile_search_interest_10 tbl_profile_search
 (cost=3579.12..16540.78 rows=333177 width=8) (actual
 time=90.619..2116.224 rows=334317 loops=1)
Recheck Cond: ((pse_normalized_text)::text =
 '1'::text)
Filter: (pse_interest_type = 10)
->  Bitmap Index Scan on
 index_pse_09_on_part_10  (cost=0.00..3579.12 rows=333177 width=0)
 (actual time=89.052..89.052 rows=340964 loops=1)
  Index Cond:
 ((pse_normalized_text)::text = '1'::text)
  Total runtime: 6103.190 ms


-- 
Alvaro Herrera                 http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [pgsql-advocacy] Conference materials (Was: [HACKERS] pdfs of

2006-10-24 Thread Andrew Sullivan
On Tue, Oct 24, 2006 at 10:13:21PM +0400, Teodor Sigaev wrote:
 Tsearch2 round table should have two presentations, one of them is already 
 on site, another: http://www.sigaev.ru/gin/Gin.pdf

Added.  Thanks.

By the way, I should have mentioned that there are apparently more
audio files on the way, but we don't have them yet.  Someone is
reportedly doing some sort of audio magic to improve the sound.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The plural of anecdote is not data.
--Roger Brinner



Re: [HACKERS] Replication documentation addition

2006-10-24 Thread Markus Schiltknecht

Hello Josh,

Josh Berkus wrote:
Hmmm ... while the primer on different types of replication is fine, I 
think what users were really looking for is a listing of the different 
replication solutions which are available for PostgreSQL and how to get 
them.


Well, let's see what we have:

* Shared Disk Fail Over
* Warm Standby Using Point-In-Time Recovery
* Point-In-Time Recovery

these first three require quite some configuration, AFAIK there is no 
tool or single solution you can download, install and be happy with. I 
probably wouldn't even call them 'replication solutions'. For me those 
are more like backups with fail-over capability.



* Continuously Running Fail-Over Server

(BTW, what is 'partial replication' supposed to mean here?)
Here we could link to Slony.


* Data Partitioning

Here we can't provide a link, it's just a way to handle the problem in 
the application code.



* Query Broadcast Replication

Here we could link to PgPool.


* Multi-Master Replication
  (or better: Distributed Shared Memory Replication)

No existing solution for PostgreSQL.


Looking at that, I'm a) missing PgCluster and b) arguing that we have to 
admit that we simply can not 'list .. replication solutions ... and how 
to get them' because all of the solutions mentioned need quite some 
knowledge and require a more or less complex installation and configuration.


Regards

Markus





Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Joshua D. Drake

 Looking at that, I'm a) missing PgCluster and b) arguing that we have to
 admit that we simply can not 'list .. replication solutions ... and how
 to get them' because all of the solutions mentioned need quite some
 knowledge and require a more or less complex installation and
 configuration.

There is also the question of whether we should have a subsection:

Closed Source replication solutions:

Mammoth Replicator
Continuent P/Cluster
ExtenDB
Greenplum MPP (although this is kind of horizontal partitioning)

Joshua D. Drake


 
 Regards
 
 Markus
 
 
 
 






Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Simon is essentially arguing that if we are willing to assume no
 incomplete write() we may as well assume it for WAL too.  This seems
 to me to be raising the risk significantly, but I admit that I can't
 put my finger on why exactly.

Actually, I think we can deal with torn pages in the WAL more easily than in
database files anyway. In database files we need to get the entire page
correct one way or the other, so we need full_page_writes in order to deal
with them properly.

In the WAL we just need to be able to detect torn pages and stop reading WAL
at that point. That's easier and doesn't really need a CRC. We could just
adopt the Sybase strategy of storing a unique id number every 512 bytes
throughout the WAL page. If those numbers don't match then we have a torn
page; the system crashed at that point and we should stop reading WAL pages.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] [SPAM?] Re: Asynchronous I/O Support

2006-10-24 Thread Ron Mayer
Zeugswetter Andreas ADI SD wrote:
 POSIX_FADV_WILLNEED definitely sounds very interesting, but:
 
 I think this interface was intended to hint larger areas (megabytes).
 But the wishful thinking was not to hint seq scans, but to advise
 single 8k pages.

Surely POSIX_FADV_SEQUENTIAL is the one intended to hint seq scans,
and POSIX_FADV_RANDOM to hint random access.  No?

ISTM, _WILLNEED seems just right for small random-access blocks.



Anyway, for those who want to see what they do in Linux,
  http://www.gelato.unsw.edu.au/lxr/source/mm/fadvise.c
Pretty scary that Bruce said it could make older Linuxes
dump core - there isn't a lot of code there.




Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Simon Riggs
On Tue, 2006-10-24 at 14:07 -0400, Tom Lane wrote:
 Gregory Maxwell [EMAIL PROTECTED] writes:
 I'm not aware of any other system which can guarantee the atomicity
  of 8k writes.
 
 The reasoning for supporting full_page_writes = off is that if you have
 a stable kernel and suitable backup power, battery backed write cache,
 etc, your risk of a partially completed write() may be low enough to
 be acceptable.  Obviously there are no 100.00% guarantees, but that's
 what you keep backups for ...
 
 Simon is essentially arguing that if we are willing to assume no
 incomplete write() we may as well assume it for WAL too.  This seems
 to me to be raising the risk significantly, but I admit that I can't
 put my finger on why exactly.

I agree about the significant additional risk, hence the additional
parameter.

I'll do some internal testing to see what the risk-reward is. If that
seems worthwhile, then I'll post the patch for general testing/comment. 

(Incidentally, having GUCs that depend on other GUCs is bad news since
they are set alphabetically. I'd want to only allow wal_checksum=off iff
full_page_writes=off, which will work, but only because W comes after F
and for no other reason. Generic solution for dependent GUCs would be
great...)

 One point not directly related to crash safety is whether CRC checking
 isn't still going to be a good idea when PITR is enabled.  Archived WAL
 files are going to have been through a great deal more transferring than
 ones merely being used to recover from a crash.

Agreed. Both disks and tapes/other mechanisms must be known CRC-safe
before this idea would be worth using in production. Many enterprises do
already think they have bomb-proof kit, so we may as well support them
in that belief. 

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] [SPAM?] Re: Asynchronous I/O Support

2006-10-24 Thread Martijn van Oosterhout
On Tue, Oct 24, 2006 at 12:53:23PM -0700, Ron Mayer wrote:
 Anyway, for those who want to see what they do in Linux,
   http://www.gelato.unsw.edu.au/lxr/source/mm/fadvise.c
 Pretty scary that Bruce said it could make older linuxes
 dump core - there isn't a lot of code there.

The bug was probably in the glibc interface to the kernel. Google found
this:

http://sourceware.org/ml/libc-hacker/2004-03/msg0.html

i.e. posix_fadvise appears to have been broken on all 64-bit
architectures prior to March 2004 due to a silly linking error.

And then things like this:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=313219

Which suggests that prior to glibc 2.3.5, posix_fadvise crashed on 2.4
kernels. That's a fairly recent version, so the bug would still be
fairly widespread.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] Release stamping (Was: [CORE] Schedule for release?)

2006-10-24 Thread Magnus Hagander
  Sorry - we're just talking about getting the version 
 number in there 
  automatically to avoid it getting forgotten during release 
 bundling.
 
  I can see that being a good idea. But I don't see Toms ./configure 
  solution working.
 
 Why not?  The shipped tarball would contain exactly the same
 pg_config.h.win32 it does today; the only difference is that 
 the version info would've been inserted automatically instead 
 of manually.
 (The start of this discussion was my observation that 
 pg_config.h.win32 contains multiple copies of the version 
 info, and sooner or later somebody would miss one while 
 stamping a release.)

Right. And then you can only build from tarball and not from CVS, right?
Because the pg_config.h.win32 with version is actually in cvs. Or am I
missing something here?


  What we could do is have the msvc build scripts edit the file and 
  replace the version with something it reads from 
 configure.in when run.
 
 That's great if you're using msvc, but what about borland?

Good point. But we could always make that part of the script a separate
one that can be run for Borland as well.

//Magnus



Re: [HACKERS] Release stamping (Was: [CORE] Schedule for release?)

2006-10-24 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Why not?  The shipped tarball would contain exactly the same
 pg_config.h.win32 it does today; the only difference is that 
 the version info would've been inserted automatically instead 
 of manually.

 Right. And then you can only build from tarball and not from CVS, right?

But isn't that true for borland already?  We've got the .def files
for libpq and the help-file for psql that we don't expect a borland
environment to be able to build.

regards, tom lane



Re: [HACKERS] Release stamping (Was: [CORE] Schedule for

2006-10-24 Thread Dave Page
On 24/10/06 21:59, Magnus Hagander [EMAIL PROTECTED] wrote:


Right. And then you can only build from tarball and not from CVS, right?
Because the pg_config.h.win32 with version is actually in cvs. Or am I
missing something here?

I suggested checking the result into CVS, per the configure script.

Regards, Dave.





Re: [HACKERS] Release stamping (Was: [CORE] Schedule for release?)

2006-10-24 Thread Magnus Hagander
  Why not?  The shipped tarball would contain exactly the same
  pg_config.h.win32 it does today; the only difference is that the 
  version info would've been inserted automatically instead of 
  manually.
 
  Right. And then you can only build from tarball and not 
 from CVS, right?
 
 But isn't that true for borland already?  We've got the .def 
 files for libpq and the help-file for psql that we don't 
 expect a borland environment to be able to build.

Borland, yes. MSVC, no. So I guess we could have the MSVC build script
generate it for MSVC, and then stick it in the tarball for Borland. It's
two different ways to do the same thing, but I guess we can do that.

//Magnus



Re: [HACKERS] [JDBC] server process (PID 1188) exited with exit code -1073741819, 8.2 beta1

2006-10-24 Thread Dave Cramer
This is a server bug; I will post to hackers for you. It has little
to do with JDBC; however, the ? can't be a column in a prepared statement.


DAVE
On 24-Oct-06, at 4:45 PM, JEAN-PIERRE PELLETIER wrote:


Hi,

I have a query that throws org.postgresql.util.PSQLException: An I/O
error occurred while sending to the backend. Here is a simplified
version of it where I used pg_database, but any table or view does
the same.


PreparedStatement pstmt = connection.prepareStatement("select ? from (select * from pg_database) t");

pstmt.setObject(1, 1);
ResultSet rs = pstmt.executeQuery();

It restarts PostgreSQL with this in the Log
2006-10-24 15:35:38 LOG:  server process (PID 1188) exited with  
exit code -1073741819
2006-10-24 15:35:38 LOG:  terminating any other active server  
processes
2006-10-24 15:35:38 LOG:  all server processes terminated;  
reinitializing
2006-10-24 15:35:39 LOG:  database system was interrupted at  
2006-10-24 15:34:54 Eastern Standard Time
2006-10-24 15:35:39 LOG:  Windows fopen(recovery.conf,r)  
failed: code 2, errno 2
2006-10-24 15:35:39 LOG:  Windows fopen(pg_xlog/ 
0001.history,r) failed: code 2, errno 2
2006-10-24 15:35:39 LOG:  Windows fopen(backup_label,r) failed:  
code 2, errno 2

2006-10-24 15:35:39 LOG:  checkpoint record is at 0/31FDF0A0
2006-10-24 15:35:39 LOG:  redo record is at 0/31FDF0A0; undo record  
is at 0/0; shutdown TRUE
2006-10-24 15:35:39 LOG:  next transaction ID: 0/22535; next OID:  
101011

2006-10-24 15:35:39 LOG:  next MultiXactId: 1; next MultiXactOffset: 0
2006-10-24 15:35:39 LOG:  database system was not properly shut  
down; automatic recovery in progress

2006-10-24 15:35:39 LOG:  record with zero length at 0/31FDF0F0
2006-10-24 15:35:39 LOG:  redo is not required
2006-10-24 15:35:40 LOG:  database system is ready
2006-10-24 15:35:40 LOG:  Windows fopen(global/pg_fsm.cache,rb)  
failed: code 2, errno 2
2006-10-24 15:35:40 LOG:  transaction ID wrap limit is 2147484172,  
limited by database postgres
2006-10-24 15:36:40 LOG:  Windows fopen(global/pgstat.stat,rb)  
failed: code 2, errno 2
2006-10-24 15:36:40 LOG:  Windows fopen(global/pgstat.stat,rb)  
failed: code 2, errno 2


I am using PostgreSQL 8.2 beta1 on Windows XP, Service Pack 2
Both JDBC3 driver 8.1 build 407 and 8.2 build 503 do the same.

Thanks,
Jean-Pierre Pelletier
e-djuster









Re: [HACKERS] [JDBC] server process (PID 1188) exited with exit code -1073741819, 8.2 beta1

2006-10-24 Thread Tom Lane
Dave Cramer [EMAIL PROTECTED] writes:
 This is a server bug, I will post to hackers for you,

Please provide a complete test case.  I tried to reproduce the failure
in libpq, with

/* Here is our out-of-line parameter value */
paramValues[0] = "joe's place";

res = PQexecParams(conn,
                   "SELECT $1 FROM (select * from pg_database) t",
                   1,           /* one param */
                   NULL,        /* let the backend deduce param type */
                   paramValues,
                   NULL,        /* don't need param lengths since text */
                   NULL,        /* default to all text params */
                   0);          /* ask for text results */

and got nothing worse than

SELECT failed: ERROR:  could not determine data type of parameter $1


regards, tom lane



Re: [HACKERS] [JDBC] server process (PID 1188) exited with exit code

2006-10-24 Thread Oliver Jowett

Tom Lane wrote:


   NULL,/* let the backend deduce param type */


I think the JDBC driver will be passing the int4 OID for the param type 
in this case.


Best thing is probably for the OP to run with loglevel=2 and see exactly 
what's being sent, though.


-O



Re: [HACKERS] Incorrect behavior with CE and ORDER BY

2006-10-24 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Is this possible?  It would be very fast.

It's possible but not exactly simple.  As an example, your proposed
plan:

 Limit (50)
   Sort (key: pse_lastlogin)
 Result
Append
   Limit (50)
SeqScan tbl_profile_search
 Limit (50)
Indexscan tbl_profile_search_interest_1
 Limit (50)
IndexScan on the index mentioned above

is wrong because there's no guarantee that the first 50 elements of a
seqscan will be anything special.  You could imagine dealing with that
by sorting the seqscan results and limiting to 50, or by not
sorting/limiting that data at all but letting the upper sort see all the
seqscan entries.  Offhand I think either of those could win depending on
how many elements the seqscan will yield.  Also, it might be interesting
to consider inventing a merge plan node type that takes N
already-sorted inputs and produces a sorted output stream.  Then we'd
need to trade off this approach versus doing the top-level sort, which
could cope with some of its inputs not being pre-sorted.

This seems to have some aspects in common with the recent discussion
about how to optimize min/max aggregates across an appendrel set.

regards, tom lane



[HACKERS] materialised view

2006-10-24 Thread rajesh boppana
 i want to implement materialized views in postgresql . to do as i want to modify the code in backend but i don't know what r the files i have to modify. so please help me by mentioning about the backend code.


Re: [HACKERS] materialised view

2006-10-24 Thread Jaime Casanova

On 10/24/06, rajesh boppana [EMAIL PROTECTED] wrote:

 I want to implement materialized views in PostgreSQL. To do so I
want to modify the backend code, but I don't know which files I have
to modify, so please help me by pointing me at the relevant backend code.



http://www.postgresql.org/developer/coding

--
regards,
Jaime Casanova

Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning.
  Richard Cook



Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Tom Lane
Gurjeet Singh [EMAIL PROTECTED] writes:
 On 10/23/06, Tom Lane [EMAIL PROTECTED]  wrote:
 I didn't particularly trust the timing calculations in your benchmark
 program,

 Any particular reason? (why and what did you doubt in it?).

Well, the specific thing that set off my bogometer was

#define TV_DIFF_MILLI(tv1, tv2) \
((tv2.tv_sec*1000+((tv2.tv_usec)/1000))-(tv1.tv_sec*1000+((tv1.tv_usec)/1000)))

which is going to have overflow problems on any platform where tv_sec
isn't a 64-bit type (which is still all of 'em AFAIK).  But more
generally, your test is timing a CRC across 100 4Kb segments, which
isn't representative of PG's usage of CRCs.  I don't think there are
any XLogWrite calls that have more than about 5 segments, and in most
cases those segments contain a few dozen bytes not a few K.  So you
need to be looking at much shorter loop runs.

The test case I proposed uses timing code that I trusted (borrowed from
long-established coding in postgres.c), and tests loop lengths that are
somewhat representative for PG, but it is still biased in favor of slice8
because it repeats the CRC calculations consecutively without any other
activity --- presumably this fact creates a bias for a method that needs
more L2 cache space over one that doesn't need so much.  I'd have tried
harder to make an unbiased test case if this version had showed slice8 as
competitive, but so far it seems that on a majority of current CPUs and
compilers it's not competitive.

regards, tom lane



[HACKERS] Reg external sorting alogrithm

2006-10-24 Thread Praveen Kumar N

Hi,
	Can anybody tell me the computational complexity, in time and
space, of the external sorting algorithm used by Postgres?
And one more question: how does the number of disk I/Os vary with jsf and
the size of the data when using external sorting (I mean, is the
relationship linear or logarithmic)?


thanks in anticipation.

Regards,
 N Praveen Kumar




Re: [HACKERS] Reg external sorting alogrithm

2006-10-24 Thread Tom Lane
Praveen Kumar N [EMAIL PROTECTED] writes:
   can anybody tell me what is the computational complexity of 
 external sorting algorithm used by postgres in terms of time and space.

See the comments at the head of tuplesort.c:
http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/sort/tuplesort.c

regards, tom lane



Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Simon Riggs
On Tue, 2006-10-24 at 12:34 -0700, Joshua D. Drake wrote:
  Looking at that, I'm a) missing PgCluster and b) arguing that we have to
  admit that we simply can not 'list .. replication solutions ... and how
  to get them' because all of the solutions mentioned need quite some
  knowledge and require a more or less complex installation and
  configuration.
 
 There is also the question if we should have a sub section:
 
 Closed Source replication solutions:
 
 Mammoth Replicator
 Continuent P/Cluster
 ExtenDB
 Greenplum MPP (although this is kind of horizontal partitioning)

Where do you draw the line? You may be surprised at what other options
that includes. I'm happy to include a whole range of things, but please
be very careful and precise about what you wish for.

There are enough good solutions for open source PostgreSQL that it is easy
and straightforward to limit it to just that. New contributions welcome,
of course.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Joshua D. Drake
Simon Riggs wrote:
 On Tue, 2006-10-24 at 12:34 -0700, Joshua D. Drake wrote:
 Looking at that, I'm a) missing PgCluster and b) arguing that we have to
 admit that we simply can not 'list .. replication solutions ... and how
 to get them' because all of the solutions mentioned need quite some
 knowledge and require a more or less complex installation and
 configuration.
 There is also the question if we should have a sub section:

 Closed Source replication solutions:

 Mammoth Replicator
 Continuent P/Cluster
 ExtenDB
 Greenplum MPP (although this is kind of horizontal partitioning)
 
 Where do you draw the line?

Well that is certainly a good question but we do include links to some
of the more prominent closed source software on the website as well.

 You maybe surprised about what other options
 that includes. I'm happy to include a whole range of things, but please
 be very careful and precise about what you wish for.

If it were me, I would say that the replication option has to be
specific to PostgreSQL (e.g. cjdbc or synchronous jakarta pooling
doesn't go in).

Sincerely,

Joshua D. Drake



-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




Re: [HACKERS] [DOCS] Replication documentation addition

2006-10-24 Thread Simon Riggs
On Tue, 2006-10-24 at 00:20 -0400, Bruce Momjian wrote:
 Here is a new replication documentation section I want to add for 8.2:
 
   ftp://momjian.us/pub/postgresql/mypatches/replication
 
 Comments welcomed.

It's a very good start to a complete minefield of competing solutions.

My first thought would be to differentiate between clustering and
replication, which will bring out many differences.

My second thought would be to differentiate between load balancing,
multi-threading, parallel query, high availability and recoverability,
which would probably sort out the true differences in the above mix. But
that wouldn't help most people and almost everybody would find fault.

IMHO most people I've spoken to take replication to mean an HA
solution, so perhaps we should cover it in those terms.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Simon Riggs
On Tue, 2006-10-24 at 15:13 -0700, Joshua D. Drake wrote:

 If it were me, I would say that the replication option has to be
 specific to PostgreSQL (e.g; cjdbc or synchronous jakarta pooling
 doesn't go in).

...and how do you define PostgreSQL exactly?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Joshua D. Drake
Simon Riggs wrote:
 On Tue, 2006-10-24 at 15:13 -0700, Joshua D. Drake wrote:
 
 If it were me, I would say that the replication option has to be
 specific to PostgreSQL (e.g; cjdbc or synchronous jakarta pooling
 doesn't go in).
 
 ...and how do you define PostgreSQL exactly?

A replication product or software designed to work only with PostgreSQL?

I know there are some other products out there that will work from one
db to another, but I am not sure if those would be considered HA
solutions or migration solutions (which we could certainly document).

Sincerely,

Joshua D. Drake




-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Simon Riggs
On Tue, 2006-10-24 at 15:33 -0700, Joshua D. Drake wrote:
 Simon Riggs wrote:
  On Tue, 2006-10-24 at 15:13 -0700, Joshua D. Drake wrote:
  
  If it were me, I would say that the replication option has to be
  specific to PostgreSQL (e.g; cjdbc or synchronous jakarta pooling
  doesn't go in).
  
  ...and how do you define PostgreSQL exactly?
 
 I replication product or software defined to work with only PostgreSQL?

(again)... how do you define PostgreSQL exactly?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Joshua D. Drake
Simon Riggs wrote:
 On Tue, 2006-10-24 at 15:33 -0700, Joshua D. Drake wrote:
 Simon Riggs wrote:
 On Tue, 2006-10-24 at 15:13 -0700, Joshua D. Drake wrote:

 If it were me, I would say that the replication option has to be
 specific to PostgreSQL (e.g; cjdbc or synchronous jakarta pooling
 doesn't go in).
 ...and how do you define PostgreSQL exactly?
 I replication product or software defined to work with only PostgreSQL?
 
 (again)... how do you define PostgreSQL exactly?

What about PostgreSQL is unclear? Is your question whether I consider
EnterpriseDB to be PostgreSQL? I have no comment on that matter.

Sincerely,

Joshua D. Drake




-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




Re: [HACKERS] Replication documentation addition

2006-10-24 Thread Jim C. Nasby
On Mon, Oct 23, 2006 at 11:39:34PM -0400, Bruce Momjian wrote:
 Query Broadcast Replication
 ---
 
 This involves sending write queries to multiple servers.  Read-only
 queries can be sent to a single server because there is no need for all
 servers to process them.  This can be complex to set up because functions
 like random() and CURRENT_TIMESTAMP will have different values on
 different servers, and sequences should be consistent across servers.
 Pgpool implements this type of replication.

Isn't there another active project that does this besides pgpool?

It's probably also worth mentioning the commercial replication schemes
that are out there.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Jim C. Nasby
On Tue, Oct 24, 2006 at 03:33:03PM -0700, Joshua D. Drake wrote:
 Simon Riggs wrote:
  On Tue, 2006-10-24 at 15:13 -0700, Joshua D. Drake wrote:
  
  If it were me, I would say that the replication option has to be
  specific to PostgreSQL (e.g; cjdbc or synchronous jakarta pooling
  doesn't go in).
  
  ...and how do you define PostgreSQL exactly?
 
 I replication product or software defined to work with only PostgreSQL?
 
AFAIK Continuent's product fails that test...

I don't see any reason to exclude things that work with databases other
than PostgreSQL, though I agree that replication that's actually in the
application space (ie: it ties you to TomCat or some other platform)
probably doesn't belong.

My feeling is that people reading this chapter are looking for solutions
and probably don't care as much about how exactly the solution works so
long as it meets their needs.

 I know there are some other products out there that will work from one
 db to another, but I am not sure if those would be considered HA
 solutions or migration solutions (which we could certainly document).
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Joshua D. Drake
Jim C. Nasby wrote:
 On Tue, Oct 24, 2006 at 03:33:03PM -0700, Joshua D. Drake wrote:
 Simon Riggs wrote:
 On Tue, 2006-10-24 at 15:13 -0700, Joshua D. Drake wrote:

 If it were me, I would say that the replication option has to be
 specific to PostgreSQL (e.g; cjdbc or synchronous jakarta pooling
 doesn't go in).
 ...and how do you define PostgreSQL exactly?
 I replication product or software defined to work with only PostgreSQL?
  
 AFAIK Continuent's product fails that test...

To my knowledge, p/cluster only works with PostgreSQL but I could be wrong.

 
 I don't see any reason to exclude things that work with databases other
 than PostgreSQL, though I agree that replication that's actually in the
 application space (ie: it ties you to TomCat or some other platform)
 probably doesn't belong.

I was just trying to have a defined criteria of some sort. We could fill
up pages and pages of possible replication solutions :)

Joshua D. Drake



-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




Re: [HACKERS] Replication documentation addition

2006-10-24 Thread Bruce Momjian

OK, I have updated the URL.  Please let me know how you like it.

---

Hannu Krosing wrote:
 On Tue, 2006-10-24 at 00:20, Bruce Momjian wrote:
  Here is a new replication documentation section I want to add for 8.2:
  
  ftp://momjian.us/pub/postgresql/mypatches/replication
 
 This is how data partitioning is currently described there
 
  Data Partitioning
  -
  
  Data partitioning splits the database into data sets.  To achieve
  replication, each data set can only be modified by one server.  For
  example, data can be partitioned by offices, e.g. London and Paris. 
  While London and Paris servers have all data records, only London can
  modify London records, and Paris can only modify Paris records.  Such
  partitioning is usually accomplished in application code, though rules
  and triggers can help enforce partitioning and keep the read-only data
  sets current.  Slony can also be used in such a setup.  While Slony
  replicates only entire tables, London and Paris can be placed in
  separate tables, and inheritance can be used to access from both tables
  using a single table name.
 
 Maybe another use of partitioning should also be mentioned: that is,
 when partitioning is used to overcome limitations of single servers
 (especially IO and memory, but also CPU), and only a subset of data is
 stored and processed on each server.
 
 As an example of this type of partitioning you could mention Bizgres MPP
 (a PG-based commercial product, http://www.greenplum.com ), which
 partitions data to use I/O and CPU of several DB servers for processing
 complex OLAP queries, and Pl_Proxy
 ( http://pgfoundry.org/projects/plproxy/ ) which does the same for OLTP
 loads.
 
 I think the official term for this kind of replication is
 Shared-Nothing Clustering.
 
 -- 
 
 Hannu Krosing
 Database Architect
 Skype Technologies OÜ
 Akadeemia tee 21 F, Tallinn, 12618, Estonia
 
 Skype me:  callto:hkrosing
 Get Skype for free:  http://www.skype.com
 

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Jeff Frost

On Tue, 24 Oct 2006, Joshua D. Drake wrote:


AFAIK Continuent's product fails that test...


To my knowledge, p/cluster only works with PostgreSQL but I could be wrong.



p/cluster was the old name for the PostgreSQL specific version.  It's been 
rebranded as uni/cluster and they have versions for both PostgreSQL and MySQL. 
One of my customers is trying it out currently.


--
Jeff Frost, Owner   [EMAIL PROTECTED]
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954



Re: [HACKERS] Replication documentation addition

2006-10-24 Thread Luke Lonergan
Bruce, 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Bruce Momjian
 Sent: Tuesday, October 24, 2006 5:16 PM
 To: Hannu Krosing
 Cc: PostgreSQL-documentation; PostgreSQL-development
 Subject: Re: [HACKERS] Replication documentation addition
 
 
 OK, I have updated the URL.  Please let me know how you like it.

There's a typo on line 8, first paragraph:

perhaps with only one server allowing write rwork together at the same
time.

Also, consider this wording of the last description:

Single-Query Clustering...

Replaced by:

Shared Nothing Clustering
---

This allows multiple servers with separate disks to work together on
each query.  In shared-nothing clusters, the work of answering each
query is distributed among the servers to increase performance through
parallelism.  These systems typically provide high availability by
using other forms of replication internally.

While there are no open source options for this type of clustering,
several commercial products implement this approach, enabling
PostgreSQL to achieve very high performance for multi-terabyte
business intelligence databases.

- Luke


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [DOCS] Replication documentation addition

2006-10-24 Thread Bruce Momjian

I have changed the text to reference fail-over and load balancing. 
I think it makes it clearer.  Let me know what you think.  I am hesitant
to mention commercial PostgreSQL products in our documentation.

---

Markus Schiltknecht wrote:
 Hello Bruce,
 
 Bruce Momjian wrote:
  Here is a new replication documentation section I want to add for 8.2:
  
  ftp://momjian.us/pub/postgresql/mypatches/replication
  
  Comments welcomed.
 
 Thank you, that sounds good. It's targeted at production use and 
 currently available solutions, which makes sense in the official manual.
 
 You are explaining the sync vs. async categorization, but I sort of 
 asked myself where the explanation of single- vs. multi-master has gone. I 
 then realized that you are talking about read-only and read/write 
 mixes of servers. Then again, you mention 'Multi-Master 
 Replication' as one type of replication solution. I think we should be 
 consistent in our naming. As Single- and Multi-Master are the more 
 common terms among database replication experts, I'd recommend using 
 them and explaining what they mean instead of introducing new names.
 
 Along with that, I'd argue that Single- vs. Multi-Master is as much a 
 categorization as Sync vs. Async. In that sense, the last chapter should 
 probably be named 'Distributed-Shared-Memory Replication' or something 
 like that instead of 'Multi-Master Replication', because, as we know, 
 there are several ways of doing Multi-Master Replication (Slony-II / 
 Postgres-R, Distributed Shared Memory, 2PC in application code, or the 
 above-mentioned 'Query Broadcast Replication', which would fall into a 
 Multi-Master Replication model as well).
 
 Also in the last chapter, instead of just saying that PostgreSQL does 
 not offer this type of replication, we could probably say that 
 different projects are trying to come up with better replication 
 solutions. And there are several proprietary products based on 
 PostgreSQL which do solve some kinds of Multi-Master Replication. Not 
 that I want to advertise for any of them, but it just sounds better than 
 the current "no, we don't offer that".
 
 As this documentation mainly covers production-quality solutions (which 
 is absolutely perfect), can we document the status of current projects 
 somewhere, probably in a wiki? Or at least mention them somewhere and 
 point to their websites? It would help to get rid of all those rumors 
 and uncertainties. Or are those intentional?
 
 Just my two cents.
 
 Regards
 
 Markus
 

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [DOCS] Replication documentation addition

2006-10-24 Thread Bruce Momjian
Simon Riggs wrote:
 On Tue, 2006-10-24 at 00:20 -0400, Bruce Momjian wrote:
  Here is a new replication documentation section I want to add for 8.2:
  
  ftp://momjian.us/pub/postgresql/mypatches/replication
  
  Comments welcomed.
 
 It's a very good start to a complete minefield of competing solutions.
 
 My first thought would be to differentiate between clustering and
 replication, which will bring out many differences.

I have gone with fail-over and load balancing in the updated text.

 My second thought would be to differentiate between load balancing,
 multi-threading, parallel query, high availability and recoverability,
 which would probably sort out the true differences in the above mix. But
 that wouldn't help most people and almost everybody would find fault.

Yep.

 IMHO most people I've spoken to take replication to mean an HA
 solution, so perhaps we should cover it in those terms.

Yes, I removed any reference to replication.  It seemed too general.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Replication documentation addition

2006-10-24 Thread Bruce Momjian

I have updated the text.  Please let me know what else I should change. 
I am unsure if I should be mentioning commercial PostgreSQL products in
our documentation.

---

Hannu Krosing wrote:
 On one fine day, Tue, 2006-10-24 at 00:20, Bruce Momjian wrote:
  Here is a new replication documentation section I want to add for 8.2:
  
  ftp://momjian.us/pub/postgresql/mypatches/replication
 
 This is how data partitioning is currently described there
 
  Data Partitioning
  -
  
  Data partitioning splits the database into data sets.  To achieve
  replication, each data set can only be modified by one server.  For
  example, data can be partitioned by offices, e.g. London and Paris. 
  While London and Paris servers have all data records, only London can
  modify London records, and Paris can only modify Paris records.  Such
  partitioning is usually accomplished in application code, though rules
  and triggers can help enforce partitioning and keep the read-only data
  sets current.  Slony can also be used in such a setup.  While Slony
  replicates only entire tables, London and Paris data can be placed in
  separate tables, and inheritance can be used to access both tables
  through a single table name.
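
 The inheritance setup described there can be sketched in SQL. This is
 a minimal illustration only: the table, columns, and office values are
 invented, and the Slony configuration itself is omitted.

 ```sql
 -- Hypothetical schema: "orders" is the parent table; each office gets
 -- a child table constrained to its own rows.
 CREATE TABLE orders (
     order_id  integer PRIMARY KEY,
     office    text NOT NULL,
     amount    numeric
 );

 CREATE TABLE orders_london (CHECK (office = 'London')) INHERITS (orders);
 CREATE TABLE orders_paris  (CHECK (office = 'Paris'))  INHERITS (orders);

 -- Only the London server writes orders_london, only Paris writes
 -- orders_paris, and Slony replicates each child table from the office
 -- that owns it.  A query against the parent sees rows from both:
 SELECT office, sum(amount) FROM orders GROUP BY office;
 ```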
 
 Maybe another use of partitioning should also be mentioned: that is,
 when partitioning is used to overcome limitations of single servers
 (especially I/O and memory, but also CPU), and only a subset of data is
 stored and processed on each server.
 
 As an example of this type of partitioning you could mention Bizgres MPP
 (a PG-based commercial product, http://www.greenplum.com ), which
 partitions data to use I/O and CPU of several DB servers for processing
 complex OLAP queries, and Pl_Proxy
 ( http://pgfoundry.org/projects/plproxy/ ) which does the same for OLTP
 loads.
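
 For the PL/Proxy case, a routing function might look roughly like the
 following, using PL/Proxy's documented CLUSTER / RUN ON body syntax;
 the cluster name, function, and hash expression are invented here for
 illustration.

 ```sql
 -- Hypothetical PL/Proxy function: the body contains routing directives
 -- rather than SQL.  Calls are forwarded to the partition selected by
 -- hashing the username, so each server stores and processes only its
 -- own subset of users.
 CREATE FUNCTION get_user_email(i_username text)
 RETURNS text AS $$
     CLUSTER 'usercluster';
     RUN ON hashtext(i_username);
 $$ LANGUAGE plproxy;
 ```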
 
 I think the official term for this kind of replication is
 Shared-Nothing Clustering.
 
 -- 
 
 Hannu Krosing
 Database Architect
  Skype Technologies OÜ
 Akadeemia tee 21 F, Tallinn, 12618, Estonia
 
 Skype me:  callto:hkrosing
 Get Skype for free:  http://www.skype.com
 
 
 
 

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Replication documentation addition

2006-10-24 Thread Bruce Momjian
Markus Schiltknecht wrote:
 Looking at that, I'm a) missing PgCluster and b) arguing that we have to 
 admit that we simply cannot 'list .. replication solutions ... and how 
 to get them', because all of the solutions mentioned require quite some 
 knowledge and a more or less complex installation and configuration.

Where is pgcluster in terms of usability?  Should I mention it?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Replication documentation addition

2006-10-24 Thread Bruce Momjian

I don't think the PostgreSQL documentation should be mentioning
commercial solutions.

---

Luke Lonergan wrote:
 Bruce, 
 
  -Original Message-
  From: [EMAIL PROTECTED] 
  [mailto:[EMAIL PROTECTED] On Behalf Of Bruce Momjian
  Sent: Tuesday, October 24, 2006 5:16 PM
  To: Hannu Krosing
  Cc: PostgreSQL-documentation; PostgreSQL-development
  Subject: Re: [HACKERS] Replication documentation addition
  
  
  OK, I have updated the URL.  Please let me know how you like it.
 
 There's a typo on line 8, first paragraph:
 
 perhaps with only one server allowing write rwork together at the same
 time.
 
 Also, consider this wording of the last description:
 
 Single-Query Clustering...
 
 Replaced by:
 
 Shared Nothing Clustering
 ---
 
 This allows multiple servers with separate disks to work together on
 each query.  In shared-nothing clusters, the work of answering each
 query is distributed among the servers to increase performance through
 parallelism.  These systems typically provide high availability by
 using other forms of replication internally.
 
 While there are no open source options for this type of clustering,
 several commercial products implement this approach, enabling
 PostgreSQL to achieve very high performance for multi-terabyte
 business intelligence databases.
 
 - Luke

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Bruce Momjian
Joshua D. Drake wrote:
 
  Looking at that, I'm a) missing PgCluster and b) arguing that we have to
  admit that we simply can not 'list .. replication solutions ... and how
  to get them' because all of the solutions mentioned need quite some
  knowledge and require a more or less complex installation and
  configuration.
 
 There is also the question of whether we should have a subsection:
 
 Closed Source replication solutions:
 
 Mammoth Replicator
 Continuent P/Cluster
 ExtenDB
 Greenplum MPP (although this is kind of horizontal partitioning)

I vote no.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Replication documentation addition

2006-10-24 Thread Josh Berkus
Bruce,

 I have updated the text.  Please let me know what else I should change.
 I am unsure if I should be mentioning commercial PostgreSQL products in
 our documentation.

I think you should mention the postgresql-only ones, but just briefly with a 
link.  Bizgres MPP, ExtenDB, uni/cluster, and Mammoth Replicator.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Replication documentation addition

2006-10-24 Thread Joshua D. Drake
Josh Berkus wrote:
 Bruce,
 
 I have updated the text.  Please let me know what else I should change.
 I am unsure if I should be mentioning commercial PostgreSQL products in
 our documentation.
 
 I think you should mention the postgresql-only ones, but just briefly with a 
 link.  Bizgres MPP, ExtenDB, uni/cluster, and Mammoth Replicator.

And to further this, I would expect that it would be a subsection, e.g.,
a sect2 or sect3. I think the open source versions should absolutely
get top billing, though.

Sincerely,

Joshua D. Drake




-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Bruce Momjian
Joshua D. Drake wrote:
 Josh Berkus wrote:
  Bruce,
  
  I have updated the text.  Please let me know what else I should change.
  I am unsure if I should be mentioning commercial PostgreSQL products in
  our documentation.
  
  I think you should mention the postgresql-only ones, but just briefly with 
  a 
  link.  Bizgres MPP, ExtenDB, uni/cluster, and Mammoth Replicator.
 
 And to further this, I would expect that it would be a subsection, e.g.,
 a sect2 or sect3. I think the open source versions should absolutely
 get top billing, though.

I am not inclined to add commercial offerings.  If people wanted
commercial database offerings, they can get them from companies that
advertise.  People are coming to PostgreSQL for open source solutions,
and I think mentioning commercial ones doesn't make sense.

If we are to add them, I need to hear that from people who haven't
worked in PostgreSQL commercial replication companies.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Steve Atkins


On Oct 24, 2006, at 8:48 PM, Bruce Momjian wrote:


Joshua D. Drake wrote:
 Josh Berkus wrote:
  Bruce,
 
   I have updated the text.  Please let me know what else I should change.
   I am unsure if I should be mentioning commercial PostgreSQL products in
   our documentation.
 
  I think you should mention the postgresql-only ones, but just briefly with a
  link.  Bizgres MPP, ExtenDB, uni/cluster, and Mammoth Replicator.
 
 And to further this, I would expect that it would be a subsection, e.g.,
 a sect2 or sect3. I think the open source versions should absolutely
 get top billing, though.

I am not inclined to add commercial offerings.  If people wanted
commercial database offerings, they can get them from companies that
advertise.  People are coming to PostgreSQL for open source solutions,
and I think mentioning commercial ones doesn't make sense.

If we are to add them, I need to hear that from people who haven't
worked in PostgreSQL commercial replication companies.


I'm not coming to PostgreSQL for open source solutions. I'm coming
to PostgreSQL for _good_ solutions.

I want to see what solutions might be available for a problem I have.
I certainly want to know whether they're freely available, commercial
or some flavour of open source, but I'd like to know about all of them.

A big part of the value of PostgreSQL is the applications and extensions
that support it. Hiding the existence of some subset of those just
because of the way they're licensed is both underselling PostgreSQL
and doing something of a disservice to the users of the document.

Cheers,
  Steve

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Bruce Momjian
Steve Atkins wrote:
  If we are to add them, I need to hear that from people who haven't
  worked in PostgreSQL commercial replication companies.
 
 I'm not coming to PostgreSQL for open source solutions. I'm coming
 to PostgreSQL for _good_ solutions.
 
 I want to see what solutions might be available for a problem I have.
 I certainly want to know whether they're freely available, commercial
 or some flavour of open source, but I'd like to know about all of them.
 
 A big part of the value of PostgreSQL is the applications and extensions
 that support it. Hiding the existence of some subset of those just
 because of the way they're licensed is both underselling PostgreSQL
 and doing something of a disservice to the users of the document.

OK, does that mean we mention EnterpriseDB in the section about Oracle
functions?  Why not mention MS SQL if they have a better solution?  I
just don't see where that line can clearly be drawn on what to include.
Do we mention Netezza, which is loosely based on PostgreSQL?  It just
seems very arbitrary to include commercial software.  If someone wants
to put it on a wiki, I think that would be fine because that doesn't
seem as official.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-24 Thread Steve Atkins


On Oct 24, 2006, at 9:20 PM, Bruce Momjian wrote:


Steve Atkins wrote:
  If we are to add them, I need to hear that from people who haven't
  worked in PostgreSQL commercial replication companies.
 
 I'm not coming to PostgreSQL for open source solutions. I'm coming
 to PostgreSQL for _good_ solutions.
 
 I want to see what solutions might be available for a problem I have.
 I certainly want to know whether they're freely available, commercial
 or some flavour of open source, but I'd like to know about all of them.
 
 A big part of the value of PostgreSQL is the applications and extensions
 that support it. Hiding the existence of some subset of those just
 because of the way they're licensed is both underselling PostgreSQL
 and doing something of a disservice to the users of the document.

OK, does that mean we mention EnterpriseDB in the section about Oracle
functions?  Why not mention MS SQL if they have a better solution?  I
just don't see where that line can clearly be drawn on what to include.
Do we mention Netezza, which is loosely based on PostgreSQL?  It just
seems very arbitrary to include commercial software.  If someone wants
to put it on a wiki, I think that would be fine because that doesn't
seem as official.


Good question. The line needs to be drawn somewhere. It's basically
your judgement, tempered by other people's feedback, though. If it
were me, I'd ask myself, "Would I mention this product if it were open
source? Would mentioning it help people using the document?"

Cheers,
  Steve


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org