Re: [HACKERS] Buildfarm issues on specific machines

2005-07-16 Thread Kris Jurka


On Sun, 17 Jul 2005, Tom Lane wrote:

> The short answer is that you should install flex 2.5.4, or else forget
> about testing the 7.2 branch.  I don't think anyone will be very
> interested in making 7.2 work with flex 2.5.31.
> 

Actually there are problems in the 7.3 branch as well in the cube,
tsearch, and seg modules.  Here are some patches for the 7.2 version check
and 7.2 and 7.3 tsearch code.  I'll work on getting cube and seg up to
speed as well if people agree we want these fixes.

Kris Jurka

Index: config/programs.m4
===
RCS file: /projects/cvsroot/pgsql/config/programs.m4,v
retrieving revision 1.7
diff -c -r1.7 programs.m4
*** config/programs.m4  28 Aug 2001 14:59:11 -  1.7
--- config/programs.m4  16 Jul 2005 18:11:24 -
***
*** 26,32 
then
  echo '%%'  > conftest.l
  if $pgac_candidate -t conftest.l 2>/dev/null | grep FLEX_SCANNER >/dev/null 2>&1; then
!   if $pgac_candidate --version | grep '2\.5\.3' >/dev/null 2>&1; then
  pgac_broken_flex=$pgac_candidate
  continue
fi
--- 26,32 
then
  echo '%%'  > conftest.l
  if $pgac_candidate -t conftest.l 2>/dev/null | grep FLEX_SCANNER >/dev/null 2>&1; then
!   if $pgac_candidate --version | grep '2\.5\.3$' >/dev/null 2>&1; then
  pgac_broken_flex=$pgac_candidate
  continue
fi
? contrib/tsearch/libtsearch.so.0.0
? contrib/tsearch/parser.c
? contrib/tsearch/results
? contrib/tsearch/tsearch.sql
Index: contrib/tsearch/parser.h
===
RCS file: /projects/cvsroot/pgsql/contrib/tsearch/Attic/parser.h,v
retrieving revision 1.3
diff -c -r1.3 parser.h
*** contrib/tsearch/parser.h  28 Oct 2001 06:25:41 -  1.3
--- contrib/tsearch/parser.h  16 Jul 2005 18:11:53 -
***
*** 5,11 
  int   tokenlen;
  int   tsearch_yylex(void);
  void  start_parse_str(char *, int);
- void  start_parse_fh(FILE *, int);
  void  end_parse(void);
  
  #endif
--- 5,10 
Index: contrib/tsearch/parser.l
===
RCS file: /projects/cvsroot/pgsql/contrib/tsearch/Attic/parser.l,v
retrieving revision 1.1
diff -c -r1.1 parser.l
*** contrib/tsearch/parser.l  12 Oct 2001 23:19:09 -  1.1
--- contrib/tsearch/parser.l  16 Jul 2005 18:11:53 -
***
*** 1,56 
  %{
  #include 
  #include "deflex.h"
  #include "parser.h"
  
- /* postgres allocation function */
- #include "postgres.h"
- #define free  pfree
- #define mallocpalloc
- #define realloc repalloc
- 
- #ifdef strdup
- #undef strdup
- #endif
- #define strduppstrdup
- 
- 
  char *token = NULL;  /* pointer to token */
  char *s = NULL;  /* for returning full defis-word */
  
  YY_BUFFER_STATE buf = NULL; /* buffer to parse; it need for parse from string */
  
- int lrlimit = -1; /* for limiting read from filehandle ( -1 - unlimited read ) */
- int bytestoread = 0;  /* for limiting read from filehandle */
- 
- /* redefine macro for read limited length */
- #define YY_INPUT(buf,result,max_size) \
-   if ( yy_current_buffer->yy_is_interactive ) { \
- int c = '*', n; \
- for ( n = 0; n < max_size && \
-  (c = getc( tsearch_yyin )) != EOF && c != '\n'; ++n ) \
- buf[n] = (char) c; \
- if ( c == '\n' ) \
- buf[n++] = (char) c; \
- if ( c == EOF && ferror( tsearch_yyin ) ) \
- YY_FATAL_ERROR( "input in flex scanner failed" ); \
- result = n; \
-   }  else { \
-   if ( lrlimit == 0 ) \
-   result=YY_NULL; \
-   else { \
-   if ( lrlimit>0 ) { \
-   bytestoread = ( lrlimit > max_size ) ? max_size : lrlimit; \
-   lrlimit -= bytestoread; \
-   } else \
-   bytestoread = max_size; \
-   if ( ((result = fread( buf, 1, bytestoread, tsearch_yyin )) == 0) \
-   && ferror( tsearch_yyin ) ) \
-   YY_FATAL_ERROR( "input in flex scanner failed" ); \
-   } \
-   }
- 
- #define YY_NO_UNPUT
  %}
  
  /* parser's state for parsing defis-word */
--- 1,14 
  %{
+ #include "postgres.h"
  #include 
  #include "deflex.h"
  #include "parser.h"
  
  char *token = NULL;  /* pointer to token */
  char *s = NULL;  /* for returning full defis-word */
  
  YY_BUFFER_STATE buf = NULL; /* buffer to parse; it need for parse from string */
  
  %}
  
  /* parser's state for parsing defis-word */
***
*** 308,320 
BEGIN 

Re: [HACKERS] Changes improve the performance of INSERT and UPDATE

2005-07-16 Thread Tom Lane
Hiroki Kataoka <[EMAIL PROTECTED]> writes:
> This small patch improves the performance of INSERT and UPDATE.  On my
> machine, these changes improved performance by about 5-10% in pgbench.

BTW, in profiling the backend I've never seen PageAddItem take more than
about 1% of the runtime, and in pgbench in particular it seems to be
down around 0.1% ... so the above seems a bit optimistic ...

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Changes improve the performance of INSERT and UPDATE

2005-07-16 Thread Tom Lane
Hiroki Kataoka <[EMAIL PROTECTED]> writes:
> This small patch improves the performance of INSERT and UPDATE.  On my
> machine, these changes improved performance by about 5-10% in pgbench.

Wouldn't it be a lot less invasive to search down from the end, instead
of changing the page header format?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Autovacuum loose ends

2005-07-16 Thread Tom Lane
Ron Mayer <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> ISTM the point of the delay parameters
>> for autovac is to put a lid on its impact on interactive response.  Seen
>> in that light, you do not care exactly which table it's hitting at the
>> moment.

> Unless the table in question takes a big lock when it's VACUUMed
> like tables with GiST indexes do today.

Well, the issue there is not at the table level, but only while the
individual index is being cleaned.

I suggested a few days ago that we ought not do vacuum delays at all
while processing an index that needs an exclusive lock (this no longer
includes gist, but rtree and to a lesser extent hash still have issues).

If you don't like that, I think you'd pretty much have to invent autovac
delays that are tunable on a *per index* basis, not per table.  That
seems a bit over the top to me; it'd be a nontrivial amount of work to
implement, and there's no evidence that it's better than just removing
the vacuum_delay_point calls in rtree and hash.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Changes improve the performance of INSERT and UPDATE

2005-07-16 Thread Hiroki Kataoka

Hi all,

This small patch improves the performance of INSERT and UPDATE.  On my
machine, these changes improved performance by about 5-10% in pgbench.
Please benchmark it in a reliable environment.  Since I may have missed
some required changes, consider this patch experimental.  Note, however,
that all regression tests pass.

Thanks,

--
Hiroki Kataoka <[EMAIL PROTECTED]>
diff -ru postgresql-cvs.orig/src/backend/storage/page/bufpage.c postgresql-cvs/src/backend/storage/page/bufpage.c
--- postgresql-cvs.orig/src/backend/storage/page/bufpage.c  2005-06-15 16:53:00.0 +0900
+++ postgresql-cvs/src/backend/storage/page/bufpage.c   2005-07-14 07:41:11.0 +0900
@@ -42,6 +42,7 @@
p->pd_lower = SizeOfPageHeaderData;
p->pd_upper = pageSize - specialSize;
p->pd_special = pageSize - specialSize;
+   p->pd_insoff = FirstOffsetNumber;
PageSetPageSizeAndVersion(page, pageSize, PG_PAGE_LAYOUT_VERSION);
 }
 
@@ -100,7 +101,7 @@
  * If offsetNumber is valid and <= current max offset in the page,
  * insert item into the array at that position by shuffling ItemId's
  * down to make room.
- * If offsetNumber is not valid, then assign one by finding the first
+ * If offsetNumber is not valid, then assign one by finding the next
  * one that is both unused and deallocated.
  *
  * !!! EREPORT(ERROR) IS DISALLOWED HERE !!!
@@ -120,6 +121,7 @@
OffsetNumber limit;
boolneedshuffle = false;
booloverwritemode = (flags & OverwritePageMode) != 0;
+   boolfindunusedlinp = false;
 
flags &= ~OverwritePageMode;
 
@@ -165,9 +167,15 @@
}
else
{
-   /* offsetNumber was not passed in, so find a free slot */
-   /* look for "recyclable" (unused & deallocated) ItemId */
-   for (offsetNumber = 1; offsetNumber < limit; offsetNumber++)
+   /* offsetNumber was not passed in, try to use pd_insoff */
+   if (OffsetNumberIsValid(phdr->pd_insoff))
+   offsetNumber = phdr->pd_insoff;
+   else
+   offsetNumber = FirstOffsetNumber;
+
+   /* so find a free slot look for "recyclable" (unused & deallocated) */
+   /* ItemId */
+   for (; offsetNumber < limit; offsetNumber++)
{
itemId = PageGetItemId(phdr, offsetNumber);
if (((itemId->lp_flags & LP_USED) == 0) &&
@@ -175,6 +183,8 @@
break;
}
/* if no free slot, we'll put it at limit (1st open slot) */
+
+   findunusedlinp = true;
}
 
if (offsetNumber > limit)
@@ -222,6 +232,8 @@
/* adjust page header */
phdr->pd_lower = (LocationIndex) lower;
phdr->pd_upper = (LocationIndex) upper;
+   if (findunusedlinp)
+   phdr->pd_insoff = OffsetNumberNext(offsetNumber);
 
return offsetNumber;
 }
@@ -345,8 +357,13 @@
lp->lp_flags &= ~(LP_USED | LP_DELETE);
if (lp->lp_flags & LP_USED)
nused++;
-   else if (unused)
-   unused[i - nused] = (OffsetNumber) i;
+   else
+   {
+   if (i - nused == 0) /* check if it's first unused line pointer */
+   ((PageHeader) page)->pd_insoff = (OffsetNumber) i + 1;
+   if (unused)
+   unused[i - nused] = (OffsetNumber) i;
+   }
}
 
if (nused == 0)
diff -ru postgresql-cvs.orig/src/include/storage/bufpage.h postgresql-cvs/src/include/storage/bufpage.h
--- postgresql-cvs.orig/src/include/storage/bufpage.h   2005-04-29 06:47:18.0 +0900
+++ postgresql-cvs/src/include/storage/bufpage.h  2005-07-14 06:57:03.0 +0900
@@ -125,6 +125,7 @@
LocationIndex pd_lower; /* offset to start of free space */
LocationIndex pd_upper; /* offset to end of free space */
LocationIndex pd_special;   /* offset to start of special space */
+   OffsetNumber pd_insoff; /* offset number for next new item */
uint16  pd_pagesize_version;
ItemIdData  pd_linp[1]; /* beginning of line pointer array */
 } PageHeaderData;


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Buildfarm issues on specific machines

2005-07-16 Thread Tom Lane
"Pete St. Onge" <[EMAIL PROTECTED]> writes:
> Perhaps this will help in the diagnosis of why REL7_2_STABLE fails on
> arbor (aka caribou). Please let me know if there is anything I can try
> on this side, or if there is any other info you could use.

> [EMAIL PROTECTED]:~$ flex -V
> flex 2.5.31

Ah.  PG 7.2's configure script thinks this version of flex is "too old"
to use, and so ignores it.  It's actually too new --- 2.5.4 is the only
flex version safe to use with older PG releases.  I recall that we have
seen some apparent bugs in 2.5.31; I don't recall at the moment whether
we've worked around all the issues as of recent releases, but we surely
had not as of 7.2.

The short answer is that you should install flex 2.5.4, or else forget
about testing the 7.2 branch.  I don't think anyone will be very
interested in making 7.2 work with flex 2.5.31.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] Buildfarm issues on specific machines

2005-07-16 Thread Tom Lane
I spent a little time today cleaning up easily-fixed problems that are
causing buildfarm failures in various back branches.  Hopefully that
will result in a few more "green" entries over the next few days.  While
I was looking, I noticed several machines that seem to be failing
because of local issues:

potoroo [HEAD, 7.4]: lock file "/tmp/.s.PGSQL.65432.lock" already exists

I'm not sure if this is a problem with a stale lock file left around
from an old run, or if it happens because the machine is configured to
try to build/test several branches in parallel.  In any case, it might
be worthwhile to try to hack the buildfarm script so that the Unix
socket files are allocated in a per-branch scratch directory, not in
/tmp.  Or try to change pg_regress to use different port numbers for
different branches?

osprey [HEAD]: could not create shared memory segment: Cannot allocate memory
DETAIL:  Failed system call was shmget(key=2, size=1957888, 03600).

Kernel shmem settings too small...

dragonfly [HEAD]: libz link error

As per earlier discussion, I maintain this is local misconfiguration.

cobra [7.4, 7.3, 7.2]: --with-tcl but no Tk

Possibly adding --without-tk to the configure options is the right answer.
Otherwise, install Tk or remove --with-tcl.

cuckoo [7.3, 7.2]: --enable-nls without OS support

This looks like pilot error; but the later branches don't fail on this
machine, so did we change something in this area?

caribou [7.2]: no "flex" installed

This looks like pilot error as well, though again I don't understand why the
later branches seem to work.  Are we sure the same PATH is being used for
every branch here?  Why doesn't the buildfarm report for 7.2 show the PATH?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-16 Thread Kevin Brown
Bruce Momjian wrote:
> 
> I don't think our problem is partial writes of WAL, which we already
> check, but heap/index page writes, which we currently do not check for
> partial writes.

Hmm...I've read through the thread again and thought about the problem
further, and now think I understand what you're dealing with.

Ultimately, the problem is that you're storing diffs in the WAL, so
you have to be able to guarantee that every data/index page has been
completely written, right?

There's no way to detect a torn page without some sort of marker in
each disk-indivisible segment of the page, unless you're willing to
checksum the entire page.  With that in mind, the method Microsoft
uses for SQL Server is probably about as simple as it gets.  In our
case, I suppose we may as well allocate one byte per 256-byte segment
for the torn page marker.  Just increment the marker value each time
you write the page (you'll have to read it from the page prior to
incrementing it, of course).

Other than that, torn page detection is really just a special case of
page corruption detection.



-- 
Kevin Brown   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Autovacuum loose ends

2005-07-16 Thread Ron Mayer

Tom Lane wrote:


ISTM the point of the delay parameters
for autovac is to put a lid on its impact on interactive response.  Seen
in that light, you do not care exactly which table it's hitting at the
moment.


Unless the table in question takes a big lock when it's VACUUMed
like tables with GiST indexes do today.

Slowing down one of those vacuums on a larger table has a huge
impact on interactive responses.

With GiST indexes becoming concurrent I assume Vacuum won't lock
anymore on my tables; but I don't know if there are other index
types or condition that might make vacuums take out similar
table-wide locks.

   Ron

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] 4 pgcrypto regressions failures - 1 unsolved

2005-07-16 Thread Tom Lane
Kris Jurka <[EMAIL PROTECTED]> writes:
> consider what would happen if the shared library didn't exist at all and 
> only a static version were available.  Until this recent batch of pgcrypto 
> changes everything built fine.

Well, the right answer to that really is that pgcrypto ought not try to
link to libz unless a shared libz is available (compare for instance the
situation with plperl and an unshared libperl).  However, I'm not sure
that we could reasonably expect to make a configuration test that would
detect a situation like this --- that is, if we did look for shared
libz, we would find it, and the fact that a nonshared libz in a
different place would cause the actual link to fail seems like something
that configure would be unlikely to be able to realize.

I'm still of the opinion that your libz installation is broken; the fact
that some other products chance not to fail with it is not evidence that
it's OK.  You could for instance have installed both libz.a and libz.so
from the same build in /usr/local/lib, and that would work fine,
independently of the existence of a version in /usr/lib.

Come to think of it, are you sure that the versions in /usr/lib and
/usr/local/lib are even ABI-compatible?  If they are from different zlib
releases, I think you're risking trouble regardless.  Really the right
way to deal with this sort of thing is that you put libz.a and libz.so
in /usr/local/lib and corresponding headers in /usr/local/include, and
then you don't need to sweat whether they are exactly compatible with
what appears in /usr/lib and /usr/include.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] 4 pgcrypto regressions failures - 1 unsolved

2005-07-16 Thread Kris Jurka


On Sat, 16 Jul 2005, Tom Lane wrote:

> Kris Jurka <[EMAIL PROTECTED]> writes:
> 
> > The link line says -L/usr/local/lib -lz and libz.a is in /usr/local/lib 
> > while libz.so is in /usr/lib.
> 
> Well, that is a flat-out configuration error on the local sysadmin's
> part.  I can't think of any good reason for the .so and .a versions of a
> library to live in different places.  We certainly shouldn't hack our
> build process to build deliberately-inefficient object files in order to
> accommodate such a setup.
> 

Well, the OS only came with the shared library and I needed the static one
for some reason, so I installed it alone under /usr/local.  This works
fine with Sun's cc, and Marko's research indicates that it will also
work fine using GNU ld instead of Sun's ld.  This is certainly an unusual
thing to do, but I don't believe it is a flat-out configuration error;
consider what would happen if the shared library didn't exist at all and
only a static version were available.  Until this recent batch of pgcrypto
changes, everything built fine.

Kris Jurka

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] 4 pgcrypto regressions failures - 1 unsolved

2005-07-16 Thread Tom Lane
Kris Jurka <[EMAIL PROTECTED]> writes:
> On Sat, 16 Jul 2005, Tom Lane wrote:
>> This sure seems like a crude band-aid rather than an actual solution.
>> The bug as I see it is that gcc is choosing to link libz.a rather than
>> libz.so --- why is that happening?

> The link line says -L/usr/local/lib -lz and libz.a is in /usr/local/lib 
> while libz.so is in /usr/lib.

Well, that is a flat-out configuration error on the local sysadmin's
part.  I can't think of any good reason for the .so and .a versions of a
library to live in different places.  We certainly shouldn't hack our
build process to build deliberately-inefficient object files in order to
accommodate such a setup.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] 4 pgcrypto regressions failures - 1 unsolved

2005-07-16 Thread Kris Jurka


On Sat, 16 Jul 2005, Tom Lane wrote:

> Marko Kreen  writes:
> > I googled a bit and found two suggestions:
> > 
> > 1. http://curl.haxx.se/mail/lib-2002-01/0092.html
> > (Use -mimpure-text on linking line)
> > 
> This sure seems like a crude band-aid rather than an actual solution.
> The bug as I see it is that gcc is choosing to link libz.a rather than
> libz.so --- why is that happening?
> 

The link line says -L/usr/local/lib -lz and libz.a is in /usr/local/lib 
while libz.so is in /usr/lib.

Kris Jurka

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] pg_get_prepared?

2005-07-16 Thread Kris Jurka


On Sat, 16 Jul 2005, Christopher Kings-Lynne wrote:

> > This has been covered before, but to reiterate: why would you need this?
> > Any application worth its salt should be tracking which statements it
> > has already prepared (after all, they cannot span connections). Seems
> > a waste of resources to make a separate call to the database for
> > information you should already know.
> 
> Erm, websites...use persistent connections...you have no idea if you're 
> dealing with a new connection or a reused one, and if the statement is 
> prepared or not.
> 

I think the point is that this is the driver's problem, not the 
application's.  If you are using SQL-level PREPARE/EXECUTE in your code, 
that's your problem, but if you are using an API like:

$stmt = $conn->prepare("SELECT * FROM tab WHERE x = ?");
$result = $stmt->execute(71);

Then the driver itself should know whether the above query has been 
prepared previously, and what parameter types it was prepared with, so 
that it can cast the 71 or prepare a new statement.

Kris Jurka

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] 4 pgcrypto regressions failures - 1 unsolved

2005-07-16 Thread Tom Lane
Marko Kreen  writes:
> On Tue, Jul 12, 2005 at 01:06:46PM -0500, Kris Jurka wrote:
>>> Well the buildfarm machine kudu is actually the same machine just building 
>>> with the Sun compiler and it works fine.  It links all of libz.a into 
>>> libpgcrypto.so while gcc refuses to.
> 
> I googled a bit and found two suggestions:
> 
> 1. http://curl.haxx.se/mail/lib-2002-01/0092.html
> (Use -mimpure-text on linking line)
> 
> The attached patch does #1.  Could you try it and see if it fixes it?

This sure seems like a crude band-aid rather than an actual solution.
The bug as I see it is that gcc is choosing to link libz.a rather than
libz.so --- why is that happening?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Autovacuum name

2005-07-16 Thread Rod Taylor
On Sat, 2005-07-16 at 08:17 -0400, Bruce Momjian wrote:
> Rod Taylor wrote:
> > This may sound silly, but any chance we could change autovacuum_* GUC
> > variables to be vacuum_auto_* instead?
> > 
> > This way when you issue a SHOW ALL, all of the vacuum related parameters
> > would be in the same place.
> 
> Well, the autovacuum items control just autovacuum, while the vacuum ones
> control user vacuums as well.  I think they are best kept separate.


Users can rotate logs by hand (and I still do), but the GUC variables
for that are not named autologrotate_*.

You say that like they're two different things and like most users are
going to continue to use regular vacuum. The only reason not to use the
automated vacuum is because of bugs or lack of features, both of which
will be fixed over the next couple of releases.  By the time version 8.3
rolls around, the use of manual vacuum will likely be very rare, and we
will be stuck with GUC names that are there for historical reasons
only.

-- 


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Autovacuum name

2005-07-16 Thread Bruce Momjian
Rod Taylor wrote:
> This may sound silly, but any chance we could change autovacuum_* GUC
> variables to be vacuum_auto_* instead?
> 
> This way when you issue a SHOW ALL, all of the vacuum related parameters
> would be in the same place.

Well, the autovacuum items control just autovacuum, while the vacuum ones
control user vacuums as well.  I think they are best kept separate.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-16 Thread Bruce Momjian

I don't think our problem is partial writes of WAL, which we already
check, but heap/index page writes, which we currently do not check for
partial writes.

---

Kevin Brown wrote:
> Tom Lane wrote:
> > Simon Riggs <[EMAIL PROTECTED]> writes:
> > > I don't think we should care too much about indexes. We can rebuild
> > > them...but losing heap sectors means *data loss*.
> > 
> > If you're so concerned about *data loss* then none of this will be
> > acceptable to you at all.  We are talking about going from a system
> > that can actually survive torn-page cases to one that can only tell
> > you whether you've lost data to such a case.  Arguing about the
> > probability with which we can detect the loss seems beside the
> > point.
> 
> I realize I'm coming into this discussion a bit late, and perhaps my
> thinking on this is simplistically naive.  That said, I think I have
> an idea of how to solve the torn page problem.
> 
> If the hardware lies to you about the data being written to the disk,
> then no amount of work on our part can guarantee data integrity.  So
> the below assumes that the hardware doesn't ever lie about this.
> 
> If you want to prevent a torn page, you have to make the last
> synchronized write to the disk as part of the checkpoint process a
> write that *cannot* result in a torn page.  So it has to be a write of
> a buffer that is no larger than the sector size of the disk.  I'd make
> it 256 bytes, to be sure of accommodating pretty much any disk hardware
> out there.
> 
> In any case, the modified sequence would go something like:
> 
> 1.  write the WAL entry, and encode in it a unique magic number
> 2.  sync()
> 3.  append the unique magic number to the WAL again (or to a separate
> file if you like, it doesn't matter as long as you know where to
> look for it during recovery), using a 256 byte (at most) write
> buffer.
> 4.  sync()
> 
> 
> After the first sync(), the OS guarantees that the data you've written
> so far is committed to the platters, with the possible exception of a
> torn page during the write operation, which will only happen during a
> crash during step 2.  But if a crash happens here, then the second
> occurrence of the unique magic number will not appear in the WAL (or
> separate file, if that's the mechanism chosen), and you'll *know* that
> you can't trust that the WAL entry was completely committed to the
> platter.
> 
> If a crash happens during step 4, then either the appended magic
> number won't appear during recovery, in which case the recovery
> process can assume that the WAL entry is incomplete, or it will
> appear, in which case it's *guaranteed by the hardware* that the WAL
> entry is complete, because you'll know for sure that the previous
> sync() completed successfully.
> 
> 
> The amount of time between steps 2 and 4 should be small enough that
> there should be no significant performance penalty involved, relative
> to the time it takes for the first sync() to complete.
> 
> 
> Thoughts?
> 
> 
> 
> -- 
> Kevin Brown [EMAIL PROTECTED]
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>http://archives.postgresql.org
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] 4 pgcrypto regressions failures - 1 unsolved

2005-07-16 Thread Marko Kreen
On Fri, Jul 15, 2005 at 08:06:15PM -0500, Kris Jurka wrote:
> On Fri, 15 Jul 2005, Marko Kreen wrote:
> 
> > [buildfarm machine dragonfly]
> > 
> > On Tue, Jul 12, 2005 at 01:06:46PM -0500, Kris Jurka wrote:
> > > Well the buildfarm machine kudu is actually the same machine just 
> > > building 
> > > with the Sun compiler and it works fine.  It links all of libz.a into 
> > > libpgcrypto.so while gcc refuses to.
> > 
> > I googled a bit and found two suggestions:
> > 
> > 1. http://curl.haxx.se/mail/lib-2002-01/0092.html
> >(Use -mimpure-text on linking line)
> > 
> > The attached patch does #1.  Could you try it and see if it fixes it?
> > 
> 
> This patch works, pgcrypto links and passes its installcheck test now.
> 
> Kris Jurka

Thanks.

Here is the patch with a little comment.

It should not break anything, as it just disables an extra
argument "-assert pure-text" to the linker.

Linking a static library into a shared one is a bad idea, as the
static parts won't be shared between processes, but erroring
out is worse, especially if another compiler for the platform
allows it.

This makes gcc act the same way as Sun's cc.

-- 
marko

Index: src/Makefile.shlib
===
RCS file: /opt/arc/cvs2/pgsql/src/Makefile.shlib,v
retrieving revision 1.95
diff -u -c -r1.95 Makefile.shlib
*** src/Makefile.shlib  13 Jul 2005 17:00:44 -  1.95
--- src/Makefile.shlib  16 Jul 2005 09:59:18 -
***
*** 188,194 
  
  ifeq ($(PORTNAME), solaris)
ifeq ($(GCC), yes)
! LINK.shared   = $(CC) -shared
else
  LINK.shared   = $(CC) -G
endif
--- 188,196 
  
  ifeq ($(PORTNAME), solaris)
ifeq ($(GCC), yes)
! # -mimpure-text disables passing '-assert pure-text' to linker,
! # to allow linking static library into shared one, like Sun's cc does.
! LINK.shared   = $(CC) -shared -mimpure-text
else
  LINK.shared   = $(CC) -G
endif

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] pg_get_prepared?

2005-07-16 Thread Mario Weilguni
Am Freitag, 15. Juli 2005 14:19 schrieb Greg Sabino Mullane:
> > The use case is when you want to prepare a query, but only if it's not
> > already prepared on that connection.
>
> This has been covered before, but to reiterate: why would you need this?
> Any application worth its salt should be tracking which statements it
> has already prepared (after all, they cannot span connections). Seems
> a waste of resources to make a separate call to the database for
> information you should already know.

Does not apply to mod_php/apache: you simply do not know whether a connection 
made by pg_pconnect is a new connection or a reused one.  That has nothing to 
do with the application itself.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster