Re: [HACKERS] Mismatch in libpqwalreceiver

2010-03-24 Thread Heikki Linnakangas
Fujii Masao wrote:
 On Wed, Mar 24, 2010 at 1:49 AM, Hitoshi Harada umi.tan...@gmail.com wrote:
 It seems this commit forgot README fix.

 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/replication/walreceiver.h?r1=1.5r2=1.6
 
 Thanks for the report!

Yep, thanks. Applied, and I also added description of the returned values.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: preloading of ispell dictionary

2010-03-24 Thread Pavel Stehule
2010/3/24 Craig Ringer cr...@postnewspapers.com.au:
 Pavel Stehule wrote:

 Personally I dislike idea some dictionary precompiler - it is next
 application for maintaining and maybe not necessary.

 That's the sort of thing that can be done when first required by any
 backend and the results saved in a file for other backends to mmap().
 It'd probably want to be opened r/w access-exclusive initially, then
 re-opened read-only access-shared when ready for use.

 My only concern would be that the cache would want to be forcibly
 cleared at postmaster start, so that restart the postmaster fixes any
 messsed-up-cache issues that might arise (not that they should) without
 people having to go rm'ing in the datadir. Even if Pg never has any bugs
 that result in bad cache files, the file system / bad memory / cosmic
 rays / etc can still mangle a cache file.

 BTW, mmap() isn't an issue on Windows:
  http://msdn.microsoft.com/en-us/library/aa366556%28VS.85%29.aspx
 It's spelled CreateFileMapping, but otherwise is fairly similar, and is
 perfect for this sort of use.

 A shared read-only mapping of processed-and-cached tsearch2 dictionaries
 would save a HUGE amount of memory if many backends were using tsearch2
 at the same time. I'd make a big difference here.


If you know this area well, please, enhance my first patch. I am not
able to oppose to Tom, who has a clean opinion on this patch :(

Pavel


 --
 Craig Ringer


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Add connection messages for streaming replication.

2010-03-24 Thread Fujii Masao
On Wed, Mar 24, 2010 at 2:25 PM, Simon Riggs si...@2ndquadrant.com wrote:
 The main thing for me was that it logged something. The above two ways
 occurred to me and figured we'd end up discussing it.

 The first way is slightly confusing for the reason stated, agreed. By
 using the same form of words as is used currently, all existing scripts
 that search for connection details will all still work. The second way
 is more informative, if you don't know replication is a
 pseudo-database, but it will break all existing scripts.

 My own feeling was that breaking existing scripts was not a price worth
 paying for the extra information in the second form of the message,
 since its just the same words re-arranged.

Since the meaning of the first message is different between 8.4 and 9.0
(in 8.4, the normal connection to the database 'replication', in 9.0, the
connection for replication from the standby server), we would still need
to change the existing scripts. No?

What is worse is that we can connect to the real database 'replication'
in 9.0. So we might be unable to discern that normal connection from the
replication connection by seeing the first message.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Add connection messages for streaming replication.

2010-03-24 Thread Simon Riggs
On Wed, 2010-03-24 at 17:36 +0900, Fujii Masao wrote:
 On Wed, Mar 24, 2010 at 2:25 PM, Simon Riggs si...@2ndquadrant.com wrote:
  The main thing for me was that it logged something. The above two ways
  occurred to me and figured we'd end up discussing it.
 
  The first way is slightly confusing for the reason stated, agreed. By
  using the same form of words as is used currently, all existing scripts
  that search for connection details will all still work. The second way
  is more informative, if you don't know replication is a
  pseudo-database, but it will break all existing scripts.
 
  My own feeling was that breaking existing scripts was not a price worth
  paying for the extra information in the second form of the message,
  since its just the same words re-arranged.
 
 Since the meaning of the first message is different between 8.4 and 9.0
 (in 8.4, the normal connection to the database 'replication', in 9.0, the
 connection for replication from the standby server), we would still need
 to change the existing scripts. No?
 
 What is worse is that we can connect to the real database 'replication'
 in 9.0. So we might be unable to discern that normal connection from the
 replication connection by seeing the first message.

So we are allowing a database to be called REPLICATION? Surely there
are some significant problems in that case. How will access control to
that database work in the pg_hba.conf?

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Add connection messages for streaming replication.

2010-03-24 Thread Fujii Masao
On Wed, Mar 24, 2010 at 7:29 PM, Simon Riggs si...@2ndquadrant.com wrote:
 So we are allowing a database to be called REPLICATION?

Yes.

 Surely there
 are some significant problems in that case. How will access control to
 that database work in the pg_hba.conf?

We can do that by enclosing the database field of pg_hba.conf in double
quotes as follows.

TYPE   DATABASEUSER  CIDR-ADDRESS  METHOD
host   replication   foo   192.168.0.5   md5

In pg_hba.conf, double-quoted keyword like all, sameuser, samerole
or replication matches a database with that name.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Add connection messages for streaming replication.

2010-03-24 Thread Simon Riggs
On Wed, 2010-03-24 at 19:49 +0900, Fujii Masao wrote:
 On Wed, Mar 24, 2010 at 7:29 PM, Simon Riggs si...@2ndquadrant.com wrote:
  So we are allowing a database to be called REPLICATION?
 
 Yes.
 
  Surely there
  are some significant problems in that case. How will access control to
  that database work in the pg_hba.conf?
 
 We can do that by enclosing the database field of pg_hba.conf in double
 quotes as follows.
 
 TYPE   DATABASEUSER  CIDR-ADDRESS  METHOD
 host   replication   foo   192.168.0.5   md5
 
 In pg_hba.conf, double-quoted keyword like all, sameuser, samerole
 or replication matches a database with that name.

So we might have a pg_hba.conf that looks like this

TYPE   DATABASEUSER  CIDR-ADDRESS  METHOD
host   replication   foo   192.168.0.5   md5
host   replication foo   192.168.0.5   md5

Which looks pretty strange.
I think we should change that, though if not we should at least document
it.

That probably tips the balance towards having the alternate wording:
LOG:  replication connection authorized: user=foo



It also highlights another problem: it's possible to have database names
that contain spaces. There are no double quotes around most of the
things that get logged. So if we do

CREATE DATABASE oh my god;

will cause things like this to be logged without quotes

LOG:  connection authorized: user=foo database=oh my god

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Performance Improvement for Unique Indexes

2010-03-24 Thread Gokulakannan Somasundaram
Hi,
   While i was studying the unique index checks very closely, i realized
that what we need is to find out whether the tuple is deleted / not. So say
a tuple is deleted by a transaction, but it is not dead( because of some
long running transaction ), still we can mark a hint bit as deleted and it
will help the subsequent transactions doing the unique checks. As a matter
of fact, it will help the deferred_unique cases, since it will anyway check
the tuples twice, if there is a duplicate.
   So we have one bit left in the Index Tuple that can be used as hint bit.
If we are ready to break the disk compatibility, then we can store the size
as a multiple of 8, and we will get three bits free. Any comments?

Thanks,
Gokul.


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add connection messages for streaming replication.

2010-03-24 Thread Heikki Linnakangas
Simon Riggs wrote:
 So we might have a pg_hba.conf that looks like this
 
 TYPE   DATABASEUSER  CIDR-ADDRESS  METHOD
 host   replication   foo   192.168.0.5   md5
 host   replication foo   192.168.0.5   md5
 
 Which looks pretty strange.
 I think we should change that, though if not we should at least document
 it.

The default pg_hba.conf says:

 # Database and user names containing spaces, commas, quotes and other
 # special characters must be quoted.  Quoting one of the keywords
 # all, sameuser, samerole or replication makes the name lose
 # its special character, and just match a database or username with
 # that name.

but I don't see any mention of that in the docs. How about:

*** client-auth.sgml24 Mar 2010 09:44:06 +0200  1.134
--- client-auth.sgml24 Mar 2010 13:21:16 +0200  
***
*** 77,84 
 a set of records, one per line. Blank lines are ignored, as is any
 text after the literal#/literal comment character. A record is made
 up of a number of fields which are separated by spaces and/or tabs.
!Fields can contain white space if the field value is quoted. Records
!cannot be continued across lines.
/para

para
--- 77,87 
 a set of records, one per line. Blank lines are ignored, as is any
 text after the literal#/literal comment character. A record is made
 up of a number of fields which are separated by spaces and/or tabs.
!Fields can contain white space if the field value is quoted.
!Quoting one of the keywords in database or username field (e.g all
!or replication) makes the name lose its special character, and just
!match a database or username with that name. Records cannot be
!continued across lines.
/para

para


 That probably tips the balance towards having the alternate wording:
 LOG:  replication connection authorized: user=foo

+1

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PHONY targets in Makefile.global.in

2010-03-24 Thread Fujii Masao
Hi,

Why aren't installcheck-parallel, world, install-world and
installcheck-world declared as a PHONY target in Makefile.global.in?
The attached patch does that.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


phony_targets_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add connection messages for streaming replication.

2010-03-24 Thread Fujii Masao
On Wed, Mar 24, 2010 at 8:22 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 but I don't see any mention of that in the docs. How about:

 *** client-auth.sgml    24 Mar 2010 09:44:06 +0200      1.134
 --- client-auth.sgml    24 Mar 2010 13:21:16 +0200
 ***
 *** 77,84 
     a set of records, one per line. Blank lines are ignored, as is any
     text after the literal#/literal comment character. A record is made
     up of a number of fields which are separated by spaces and/or tabs.
 !    Fields can contain white space if the field value is quoted. Records
 !    cannot be continued across lines.
    /para

    para
 --- 77,87 
     a set of records, one per line. Blank lines are ignored, as is any
     text after the literal#/literal comment character. A record is made
     up of a number of fields which are separated by spaces and/or tabs.
 !    Fields can contain white space if the field value is quoted.
 !    Quoting one of the keywords in database or username field (e.g all
 !    or replication) makes the name lose its special character, and just
 !    match a database or username with that name. Records cannot be
 !    continued across lines.
    /para

+1

 That probably tips the balance towards having the alternate wording:
 LOG:  replication connection authorized: user=foo

 +1

+1

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add connection messages for streaming replication.

2010-03-24 Thread Simon Riggs
On Wed, 2010-03-24 at 20:30 +0900, Fujii Masao wrote:
 On Wed, Mar 24, 2010 at 8:22 PM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
  but I don't see any mention of that in the docs. How about:

 +1

Yes, plus a mention in the rep docs.

  That probably tips the balance towards having the alternate wording:
  LOG:  replication connection authorized: user=foo
 
  +1
 
 +1

Will do.


What's the word on when you guys will be finished with the open items
list for SR? Feels like we're both using the other one as an excuse not
to complete our remaining actions.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement for Unique Indexes

2010-03-24 Thread Gokulakannan Somasundaram


 How are you going to unmark the hint bit in case of a rollback?


Only after you find that the transaction is committed, this hint bit has to
be set. It is equivalent to any other hint bit.

Gokul.


Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-24 Thread Heikki Linnakangas
Fujii Masao wrote:
 But in the current (v8.4 or before) behavior, recovery ends normally
 when an invalid record is found in an archived WAL file. Otherwise,
 the server would never be able to start normal processing when there
 is a corrupted archived file for some reasons. So, that invalid record
 should not be treated as a PANIC if the server is not in standby mode
 or the trigger file has been created. Thought?

Hmm, true, this changes behavior over previous releases. I tend to think
that it's always an error if there's a corrupt file in the archive,
though, and PANIC is appropriate. If the administrator wants to start up
the database anyway, he can remove the corrupt file from the archive and
place it directly in pg_xlog instead.

 When I tested the patch, the following PANIC error was thrown in the
 normal archive recovery. This seems to derive from the above change.
 The detail error sequence:
 1. In ReadRecord(), emode was set to PANIC after 0001000B
was read.
 2. 0001000C including the contrecord tried to be read
by using the emode (= PANIC). But since 0001000C did
not exist, PANIC error was thrown.
 
 -
 LOG:  restored log file 0001000B from archive
 cp: cannot stat `../data.arh/0001000C': No such file
 or directory
 PANIC:  could not open file pg_xlog/0001000C (log
 file 0, segment 12): No such file or directory
 LOG:  startup process (PID 17204) was terminated by signal 6: Aborted
 LOG:  terminating any other active server processes
 -

Thanks. That's easily fixable (applies over the previous patch):

--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3773,7 +3773,7 @@ retry:
pagelsn.xrecoff = 0;
}
/* Wait for the next page to become available */
-   if (!XLogPageRead(pagelsn, emode, false, false))
+   if (!XLogPageRead(pagelsn, emode_arg, false, false))
return NULL;

/* Check that the continuation record looks valid */

Perhaps the emode/emode_arg convention is a bit hard to read.

I'll go through the patch myself once more, and commit later today or
tomorrow if now new issues crop up.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add connection messages for streaming replication.

2010-03-24 Thread Fujii Masao
On Wed, Mar 24, 2010 at 8:52 PM, Simon Riggs si...@2ndquadrant.com wrote:
 What's the word on when you guys will be finished with the open items
 list for SR?

Sorry, I'm not sure when.

Now, I'm trying to address the open item Walreceiver is not
interruptible on win32.
It might take time to create the patch since I'm not familiar with
tweaking Makefile
or something.
http://archives.postgresql.org/pgsql-hackers/2010-03/msg00413.php

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add connection messages for streaming replication.

2010-03-24 Thread Craig Ringer

On 24/03/2010 6:29 PM, Simon Riggs wrote:


So we are allowing a database to be called REPLICATION? Surely there
are some significant problems in that case. How will access control to
that database work in the pg_hba.conf?


Surely it should be consistent with template0 and postgres:

template1=# create database postgres;
ERROR:  database postgres already exists
template1=# create database postgres;
ERROR:  database postgres already exists
template1=# create database POSTGRES;
CREATE DATABASE
template1=# create database template0;
ERROR:  database template0 already exists
template1=# create database template0;
ERROR:  database template0 already exists
template1=# create database TEMPLATE0;
CREATE DATABASE

--
Craig Ringer

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PHONY targets in Makefile.global.in

2010-03-24 Thread Andrew Dunstan



Fujii Masao wrote:

Hi,

Why aren't installcheck-parallel, world, install-world and
installcheck-world declared as a PHONY target in Makefile.global.in?
  


Lack of make-fu, probably.


The attached patch does that.

  


Thanks. Applied.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-24 Thread Fujii Masao
On Wed, Mar 24, 2010 at 9:31 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Hmm, true, this changes behavior over previous releases. I tend to think
 that it's always an error if there's a corrupt file in the archive,
 though, and PANIC is appropriate. If the administrator wants to start up
 the database anyway, he can remove the corrupt file from the archive and
 place it directly in pg_xlog instead.

Okay.

 Thanks. That's easily fixable (applies over the previous patch):

 --- a/src/backend/access/transam/xlog.c
 +++ b/src/backend/access/transam/xlog.c
 @@ -3773,7 +3773,7 @@ retry:
                pagelsn.xrecoff = 0;
            }
            /* Wait for the next page to become available */
 -           if (!XLogPageRead(pagelsn, emode, false, false))
 +           if (!XLogPageRead(pagelsn, emode_arg, false, false))
                return NULL;

            /* Check that the continuation record looks valid */

Seems correct.

 sources = ~failedSources;
 failedSources |= readSource;

The above lines in XLogPageRead() seem not to be required in normal
recovery case (i.e., standby_mode = off). So how about the attached
patch?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


failedSource_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-24 Thread Fujii Masao
On Wed, Mar 24, 2010 at 10:20 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Thanks. That's easily fixable (applies over the previous patch):

 --- a/src/backend/access/transam/xlog.c
 +++ b/src/backend/access/transam/xlog.c
 @@ -3773,7 +3773,7 @@ retry:
                pagelsn.xrecoff = 0;
            }
            /* Wait for the next page to become available */
 -           if (!XLogPageRead(pagelsn, emode, false, false))
 +           if (!XLogPageRead(pagelsn, emode_arg, false, false))
                return NULL;

            /* Check that the continuation record looks valid */

 Seems correct.

On second thought, the following lines seem to be necessary just after
calling XLogPageRead() since it reads new WAL file from another source.

   if (readSource == XLOG_FROM_STREAM || readSource == XLOG_FROM_ARCHIVE)
   emode = PANIC;
   else
   emode = emode_arg;

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: preloading of ispell dictionary

2010-03-24 Thread Bruce Momjian
Pavel Stehule wrote:
 2010/3/24 Craig Ringer cr...@postnewspapers.com.au:
  Pavel Stehule wrote:
 
  Personally I dislike idea some dictionary precompiler - it is next
  application for maintaining and maybe not necessary.
 
  That's the sort of thing that can be done when first required by any
  backend and the results saved in a file for other backends to mmap().
  It'd probably want to be opened r/w access-exclusive initially, then
  re-opened read-only access-shared when ready for use.
 
  My only concern would be that the cache would want to be forcibly
  cleared at postmaster start, so that restart the postmaster fixes any
  messsed-up-cache issues that might arise (not that they should) without
  people having to go rm'ing in the datadir. Even if Pg never has any bugs
  that result in bad cache files, the file system / bad memory / cosmic
  rays / etc can still mangle a cache file.
 
  BTW, mmap() isn't an issue on Windows:
  ?http://msdn.microsoft.com/en-us/library/aa366556%28VS.85%29.aspx
  It's spelled CreateFileMapping, but otherwise is fairly similar, and is
  perfect for this sort of use.
 
  A shared read-only mapping of processed-and-cached tsearch2 dictionaries
  would save a HUGE amount of memory if many backends were using tsearch2
  at the same time. I'd make a big difference here.
 
 
 If you know this area well, please, enhance my first patch. I am not
 able to oppose to Tom, who has a clean opinion on this patch :(

Should we add a TODO?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: preloading of ispell dictionary

2010-03-24 Thread Pavel Stehule
2010/3/24 Bruce Momjian br...@momjian.us:
 Pavel Stehule wrote:
 2010/3/24 Craig Ringer cr...@postnewspapers.com.au:
  Pavel Stehule wrote:
 
  Personally I dislike idea some dictionary precompiler - it is next
  application for maintaining and maybe not necessary.
 
  That's the sort of thing that can be done when first required by any
  backend and the results saved in a file for other backends to mmap().
  It'd probably want to be opened r/w access-exclusive initially, then
  re-opened read-only access-shared when ready for use.
 
  My only concern would be that the cache would want to be forcibly
  cleared at postmaster start, so that restart the postmaster fixes any
  messsed-up-cache issues that might arise (not that they should) without
  people having to go rm'ing in the datadir. Even if Pg never has any bugs
  that result in bad cache files, the file system / bad memory / cosmic
  rays / etc can still mangle a cache file.
 
  BTW, mmap() isn't an issue on Windows:
  ?http://msdn.microsoft.com/en-us/library/aa366556%28VS.85%29.aspx
  It's spelled CreateFileMapping, but otherwise is fairly similar, and is
  perfect for this sort of use.
 
  A shared read-only mapping of processed-and-cached tsearch2 dictionaries
  would save a HUGE amount of memory if many backends were using tsearch2
  at the same time. I'd make a big difference here.
 

 If you know this area well, please, enhance my first patch. I am not
 able to oppose to Tom, who has a clean opinion on this patch :(

 Should we add a TODO?

why not ?

Pavel

 --
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement for Unique Indexes

2010-03-24 Thread Tom Lane
Gokulakannan Somasundaram gokul...@gmail.com writes:
While i was studying the unique index checks very closely, i realized
 that what we need is to find out whether the tuple is deleted / not. So say
 a tuple is deleted by a transaction, but it is not dead( because of some
 long running transaction ), still we can mark a hint bit as deleted and it
 will help the subsequent transactions doing the unique checks. As a matter
 of fact, it will help the deferred_unique cases, since it will anyway check
 the tuples twice, if there is a duplicate.

It seems fairly unlikely to me that this would be useful enough to
justify using up a precious hint bit.  The applicability of the hint
is very short-term --- as soon as the tuple is dead to all transactions,
it can be marked with the existing LP_DEAD hint bit.  And if it's only
useful for uniqueness checks, as seems to be the case, that's another
big restriction on the value.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: preloading of ispell dictionary

2010-03-24 Thread Bruce Momjian
Pavel Stehule wrote:
 2010/3/24 Bruce Momjian br...@momjian.us:
  Pavel Stehule wrote:
  2010/3/24 Craig Ringer cr...@postnewspapers.com.au:
   Pavel Stehule wrote:
  
   Personally I dislike idea some dictionary precompiler - it is next
   application for maintaining and maybe not necessary.
  
   That's the sort of thing that can be done when first required by any
   backend and the results saved in a file for other backends to mmap().
   It'd probably want to be opened r/w access-exclusive initially, then
   re-opened read-only access-shared when ready for use.
  
   My only concern would be that the cache would want to be forcibly
   cleared at postmaster start, so that restart the postmaster fixes any
   messsed-up-cache issues that might arise (not that they should) without
   people having to go rm'ing in the datadir. Even if Pg never has any bugs
   that result in bad cache files, the file system / bad memory / cosmic
   rays / etc can still mangle a cache file.
  
   BTW, mmap() isn't an issue on Windows:
   ?http://msdn.microsoft.com/en-us/library/aa366556%28VS.85%29.aspx
   It's spelled CreateFileMapping, but otherwise is fairly similar, and is
   perfect for this sort of use.
  
   A shared read-only mapping of processed-and-cached tsearch2 dictionaries
   would save a HUGE amount of memory if many backends were using tsearch2
   at the same time. I'd make a big difference here.
  
 
  If you know this area well, please, enhance my first patch. I am not
  able to oppose to Tom, who has a clean opinion on this patch :(
 
  Should we add a TODO?
 
 why not ?

OK, what would the TODO text be?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: preloading of ispell dictionary

2010-03-24 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 OK, what would the TODO text be?

I think there are really two tasks here:

* preprocess the textual dictionary definition files into something
that can be slurped directly into memory;

* use mmap() instead of read() to read preprocessed files into memory,
on machines where such a syscall is available.

There would be considerable gain from task #1 even without mmap.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement for Unique Indexes

2010-03-24 Thread Gokulakannan Somasundaram
 it seems fairly unlikely to me that this would be useful enough to
 justify using up a precious hint bit.  The applicability of the hint
 is very short-term --- as soon as the tuple is dead to all transactions,
 it can be marked with the existing LP_DEAD hint bit.  And if it's only
 useful for uniqueness checks, as seems to be the case, that's another
 big restriction on the value.

 Right. It is of little value.

Gokul.


[HACKERS] last_statrequest is in the future

2010-03-24 Thread Tom Lane
Well, I didn't actually think that this patch
http://archives.postgresql.org/pgsql-committers/2010-03/msg00181.php
would yield much insight, but lookee what we have here:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguardt=2010-03-24%2004:00:07

[4ba99150.5099:483] LOG:  statement: VACUUM ANALYZE num_exp_add;
[4ba99145.5071:1] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:2] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:3] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:4] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:5] LOG:  last_statrequest is in the future, resetting
...
[4ba99145.5071:497] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:498] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:499] LOG:  last_statrequest is in the future, resetting
[4ba99145.5071:500] LOG:  last_statrequest is in the future, resetting
[4ba99150.5099:484] WARNING:  pgstat wait timeout

There are multiple occurrences of pgstat wait timeout in the
postmaster log (some evidently from autovacuum, because they don't show
up as regression diffs), and every one of them is associated with a
bunch of last_statrequest is in the future bleats.

So at least on jaguar, it seems that the reason for this behavior is
that the system clock is significantly skewed between the stats
collector process and the backends, to the point where stats updates
generated by the collector will never appear new enough to satisfy the
requesting backends.  I think I'm going to go back and modify the code
to show the actual numbers involved so we can see just how bad it is ---
but the skew must be more than five seconds or we'd not be seeing this
failure.  That seems to me to put it in the class of system bug.

Should we redesign the stats signaling logic to work around this,
or just hope we can nag kernel people into fixing it?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dtester-0.1 released

2010-03-24 Thread Steve Singer


Markus Wanner wrote:

 Hi,

 A git repository for dtester as well as some integration code for
 testing Postgres based projects is available at:
 http://git.postgres-r.org/


Markus,

I'm having some issues pulling from your git repository

$ git fetch postgres-dtest

fatal: http://git.postgres-r.org/postgres-dtests/info/refs not found: 
did you run git update-server-info on the server?


$ git clone http://git.postgres-r.org/dtester
Initialized empty Git repository in 
/local/home/ssinger/src/dtester/dtester/.git/
fatal: http://git.postgres-r.org/dtester/info/refs download error - The 
requested URL returned error: 500


Is your git server having issues or am I trying to get at the code in 
the wrong way.


Thanks




Regards

Markus Wanner





--
Steve Singer
Afilias Canada
Data Services Developer
416-673-1142

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: xmlconcat (was [HACKERS] 9.0 release notes done)

2010-03-24 Thread Andrew Dunstan



Peter Eisentraut wrote:

On mån, 2010-03-22 at 19:38 -0400, Andrew Dunstan wrote:
  

But if we are not comfortable about being able to do that safely, I
would be OK with just raising an error if a concatenation is
  

attempted


where one value contains a DTD.  The impact in practice should be
  

low.

  
  
Right. Can you find a way to do that using the libxml API? I haven't 
managed to, and I'm pretty sure I can construct XML that fails every 
simple string search test I can think of, either with a false negative

or a false positive.



The documentation on that is terse as usual.  In any case, you will need
to XML parse the input values, and so you might as well resort to
parsing the output value to see if it is well-formed, which should catch
this mistake and possibly others.

  


Actually, I have come to the conclusion that the biggest problem in this 
area is that we accept XML documents with a leading DOCTYPE node at all. 
Our docs state:


   The xml type can store well-formed documents, as defined by the
   XML standard, as well as content fragments, which are defined by
   the production XMLDecl? content in the XML standard.

A document with a leading DOCTYPE node matches neither of these rules, 
and when we strip the XMLDecl from a piece of XML where it's followed by 
a DOCTYPE node we turn something that is legal XML into something that 
isn't, even by our own (or possibly the standard's) relaxed definition. 
A doctypedecl can only follow an  XMLDecl, see 
http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-prolog-dtd.


So I think we need to go back to the drawing board a bit, rather than 
patch a particular reported error case. But these problems are not at 
all new to 9.0, and coming up to beta as I hope we are is not the time 
for it. I think it will have to wait to 9.1.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: xmlconcat (was [HACKERS] 9.0 release notes done)

2010-03-24 Thread Peter Eisentraut
On ons, 2010-03-24 at 14:51 -0400, Andrew Dunstan wrote:
 Actually, I have come to the conclusion that the biggest problem in
 this 
 area is that we accept XML documents with a leading DOCTYPE node at
 all. 
 Our docs state:
 
 The xml type can store well-formed documents, as defined by the
 XML standard, as well as content fragments, which are defined by
 the production XMLDecl? content in the XML standard.
 
 A document with a leading DOCTYPE node matches neither of these
 rules, 
 and when we strip the XMLDecl from a piece of XML where it's followed
 by 
 a DOCTYPE node we turn something that is legal XML into something
 that 
 isn't, even by our own (or possibly the standard's) relaxed
 definition. 
 A doctypedecl can only follow an  XMLDecl, see 
 http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-prolog-dtd.

Our version of SQL/XML support references SQL:2003 which references XML
1.0, where omitting the XMLDecl is legal.  You can't omit the XMLDecl in
XML 1.1, because you need it to communicate the fact that it's version
1.1.

But note that that is correctly supported:

=# select xmlconcat('?xml version=1.0?foo/', '?xml
version=1.0?bar/');
  xmlconcat
--
 foo/bar/

and

=# select xmlconcat('?xml version=1.1?foo/', '?xml
version=1.1?bar/');
 xmlconcat
---
 ?xml version=1.1?foo/bar/



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dtester-0.1 released

2010-03-24 Thread Markus Wanner

Steve,

Steve Singer wrote:

$ git clone http://git.postgres-r.org/dtester
Initialized empty Git repository in 
/local/home/ssinger/src/dtester/dtester/.git/
fatal: http://git.postgres-r.org/dtester/info/refs download error - The 
requested URL returned error: 500


Oh, thank you for pointing this out. I've fixed that for now.

Do I need to run git update-server-info after every pull for the http 
protocol to work? Or just once to initialize the repository?


Note that the git protocol said to be is more efficient. At least that's 
what I've heard. You might want to use that instead (especially if http 
continues to pose problems).


Kind Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dtester-0.1 released

2010-03-24 Thread Jan Urbański

On 24/03/10 21:06, Markus Wanner wrote:

Steve,

Steve Singer wrote:

$ git clone http://git.postgres-r.org/dtester
Initialized empty Git repository in
/local/home/ssinger/src/dtester/dtester/.git/
fatal: http://git.postgres-r.org/dtester/info/refs download error -
The requested URL returned error: 500


Oh, thank you for pointing this out. I've fixed that for now.

Do I need to run git update-server-info after every pull for the http
protocol to work? Or just once to initialize the repository?


After each change in the repository. If you are pushing into that 
repository, you can set the post-update hook in the repo to run git 
update-server-info and forget about it.


Cheers,
Jan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: xmlconcat (was [HACKERS] 9.0 release notes done)

2010-03-24 Thread Andrew Dunstan



Peter Eisentraut wrote:

Our version of SQL/XML support references SQL:2003 which references XML
1.0, where omitting the XMLDecl is legal.  You can't omit the XMLDecl in
XML 1.1, because you need it to communicate the fact that it's version
1.1.


  


Hmm. OK. Well here is a patch that tries to fix the xmlconcat error, 
anyway. It seems to work, but maybe could stand a little tightening.


cheers

andrew
Index: src/backend/utils/adt/xml.c
===
RCS file: /cvsroot/pgsql/src/backend/utils/adt/xml.c,v
retrieving revision 1.97
diff -c -r1.97 xml.c
*** src/backend/utils/adt/xml.c	3 Mar 2010 17:29:45 -	1.97
--- src/backend/utils/adt/xml.c	24 Mar 2010 22:05:19 -
***
*** 418,424 
--- 418,466 
  #endif
  }
  
+ #ifdef USE_LIBXML
+ static inline void
+ strip_dtd(char ** xmlstr)
+ {
+ 
+ 	xmlDocPtr	doc;
+ 	xmlChar *xmlbuff;
+ 	int buffersize;
+ 	booldtd_found = false;
+ 	xmlNodePtr child;
+ 	char * skip_xmldecl;
+ 
+ 	if (strstr(*xmlstr,!DOCTYPE) == NULL)
+ 		return ;
+ 
+ 	doc = xml_parse(cstring_to_text(*xmlstr), XMLOPTION_DOCUMENT, true, GetDatabaseEncoding());
+ 
+ 	for (child = doc-children; child != NULL; child = child-next)
+ 	{
+ 		if (child-type == XML_DOCUMENT_TYPE_NODE || 
+ 			child-type == XML_DTD_NODE)
+ 		{
+ 			xmlUnlinkNode(child);
+ 			xmlFreeNode(child);
+ 			dtd_found = true;
+ 		}
+ 	}
+ 	if (dtd_found)
+ 	{
+ 		pfree(*xmlstr);
+ 		xmlDocDumpFormatMemory(doc, xmlbuff, buffersize, 0);
+ 		if (strncmp((char *)xmlbuff,?xml,5) == 0)
+ 			skip_xmldecl = strstr((char *)xmlbuff,?\n) + 3;
+ 		else
+ 			skip_xmldecl = (char *) xmlbuff;
+ 		*xmlstr = pstrdup(skip_xmldecl);
+ 		xmlFree(xmlbuff);
+ 		
+ 	}
+ 	xmlFreeDoc(doc);
  
+ }
+ #endif
  
  /*
   * TODO: xmlconcat needs to merge the notations and unparsed entities
***
*** 460,465 
--- 502,509 
  		else if (xmlStrcmp(version, global_version) != 0)
  			global_version_no_value = true;
  
+ 		strip_dtd(str);
+ 
  		appendStringInfoString(buf, str + len);
  		pfree(str);
  	}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: xmlconcat (was [HACKERS] 9.0 release notes done)

2010-03-24 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Hmm. OK. Well here is a patch that tries to fix the xmlconcat error, 
 anyway. It seems to work, but maybe could stand a little tightening.

I liked your previous idea (rethink the whole mess in 9.1) better.

As far as the patch itself is concerned, the complete lack of error
checks seems scary, and I wonder whether the case sensitivity and
lack of whitespace tolerance in the string comparisons is OK.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: xmlconcat (was [HACKERS] 9.0 release notes done)

2010-03-24 Thread Andrew Dunstan



Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:
  
Hmm. OK. Well here is a patch that tries to fix the xmlconcat error, 
anyway. It seems to work, but maybe could stand a little tightening.



I liked your previous idea (rethink the whole mess in 9.1) better.

As far as the patch itself is concerned, the complete lack of error
checks seems scary, 


Yes, this wasn't intended as the final patch. If it's not wanted right 
now, that's fine too. I just wanted to get it on the record as possibly 
something useful when we do come to reconsider the whole mess. Getting 
to grips with the libxml2 API is no fun, and it's better not to have to 
repeat it if possible ;-)



and I wonder whether the case sensitivity and
lack of whitespace tolerance in the string comparisons is OK.


  


The tokens were chosen with some care to be such that no whitespace 
tolerance would be needed (or correct). XML is case sensitive, so that's 
not an issue either.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-24 Thread Simon Riggs
On Wed, 2010-03-24 at 14:31 +0200, Heikki Linnakangas wrote:
 Fujii Masao wrote:
  But in the current (v8.4 or before) behavior, recovery ends normally
  when an invalid record is found in an archived WAL file. Otherwise,
  the server would never be able to start normal processing when there
  is a corrupted archived file for some reasons. So, that invalid record
  should not be treated as a PANIC if the server is not in standby mode
  or the trigger file has been created. Thought?
 
 Hmm, true, this changes behavior over previous releases. I tend to think
 that it's always an error if there's a corrupt file in the archive,
 though, and PANIC is appropriate. If the administrator wants to start up
 the database anyway, he can remove the corrupt file from the archive and
 place it directly in pg_xlog instead.

I don't agree with changing the behaviour from previous releases.

PANICing won't change the situation, so it just destroys server
availability. If we had 1 master and 42 slaves then this behaviour would
take down almost the whole server farm at once. Very uncool.

You might have reason to prevent the server starting up at that point,
when in standby mode, but that is not a reason to PANIC. We don't really
want all of the standbys thinking they can be the master all at once
either. Better to throw a serious ERROR and have the server still up and
available for reads.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-24 Thread Fujii Masao
On Thu, Mar 25, 2010 at 8:23 AM, Simon Riggs si...@2ndquadrant.com wrote:
 PANICing won't change the situation, so it just destroys server
 availability. If we had 1 master and 42 slaves then this behaviour would
 take down almost the whole server farm at once. Very uncool.

 You might have reason to prevent the server starting up at that point,
 when in standby mode, but that is not a reason to PANIC. We don't really
 want all of the standbys thinking they can be the master all at once
 either. Better to throw a serious ERROR and have the server still up and
 available for reads.

OK. How about making the startup process emit WARNING, stop WAL replay and
wait for the presence of trigger file, when an invalid record is found?
Which keeps the server up for readonly queries. And if the trigger file is
found, I think that the startup process should emit a FATAL, i.e., the
server should exit immediately, to prevent the server from becoming the
primary in a half-finished state. Also to allow such a halfway failover,
we should provide fast failover mode as pg_standby does?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-24 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 OK. How about making the startup process emit WARNING, stop WAL replay and
 wait for the presence of trigger file, when an invalid record is found?
 Which keeps the server up for readonly queries. And if the trigger file is
 found, I think that the startup process should emit a FATAL, i.e., the
 server should exit immediately, to prevent the server from becoming the
 primary in a half-finished state. Also to allow such a halfway failover,
 we should provide fast failover mode as pg_standby does?

I find it extremely scary to read this sort of blue-sky design
discussion going on now, two months after we were supposedly
feature-frozen for 9.0.  We need to be looking for the *rock bottom
minimum* amount of work to do to get 9.0 out the door in a usable
state; not what would be nice to have later on.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers