Re: [HACKERS] Immediate shutdown and system(3)

2009-03-02 Thread Heikki Linnakangas

Fujii Masao wrote:

On Fri, Feb 27, 2009 at 6:52 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:

I'm leaning towards option 3, but I wonder if anyone sees a better solution.


4. Use shared memory to tell the startup process about the shutdown state.
When a shutdown signal arrives, the postmaster writes the corresponding shutdown
state to shared memory before signaling the child processes. The startup
process checks the shutdown state whenever it executes system(), and determines
how to exit according to that state. This solution doesn't change any existing
behavior of pg_standby. What is your opinion?


That would only solve the problem for pg_standby. Other programs you 
might use as a restore_command or archive_command like cp or rsync 
would still core dump on the SIGQUIT.
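
As a minimal sketch of the mechanics under discussion, assuming a POSIX system: the caller of system(3) can inspect the returned wait status to tell a normal exit from death by a signal such as SIGQUIT. The names classify_status and killed_by_sigquit are hypothetical helpers for illustration, not PostgreSQL functions.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>

typedef enum
{
    CMD_EXITED,     /* command called exit(); detail = exit code */
    CMD_SIGNALED,   /* command was killed by a signal; detail = signal number */
    CMD_FAILED      /* system() itself failed, or status not recognized */
} CmdOutcome;

/* Decode a wait status as returned by system(3). */
CmdOutcome
classify_status(int status, int *detail)
{
    assert(detail != NULL);
    *detail = 0;

    if (status == -1)
        return CMD_FAILED;
    if (WIFEXITED(status))
    {
        *detail = WEXITSTATUS(status);
        return CMD_EXITED;
    }
    if (WIFSIGNALED(status))
    {
        *detail = WTERMSIG(status);
        return CMD_SIGNALED;
    }
    return CMD_FAILED;
}

/* True if the command run by system(3) was terminated by SIGQUIT. */
int
killed_by_sigquit(int status)
{
    return status != -1 && WIFSIGNALED(status) && WTERMSIG(status) == SIGQUIT;
}
```

This only shows how the parent can observe what happened; it does not stop a child such as cp or rsync from dumping core when it receives SIGQUIT.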


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xpath processing brain dead

2009-03-02 Thread Peter Eisentraut

Andrew Dunstan wrote:
can you point me at any call in libxml2 which will evaluate an xpath 
expression in the context of a nodeset instead of a document? Quite 
apart from anything else, xpath requires there to be a (single) context 
node (see http://www.w3.org/TR/xpath20/#context ). For a doc, we set 
that node to the document node, but what would it be for a node-set or a 
fragment? If we can't get over that hurdle we're screwed in pursuing 
your line of thought.


Which may hint at the fact that running xpath on content fragments is 
ill-defined to begin with?!?





Re: [HACKERS] a proposal for an extendable deparser

2009-03-02 Thread Heikki Linnakangas

Dave Gudeman wrote:

I don't need to add new node types or add any syntax; it is the output that
I'm concerned with. What I want is a way to print a tree according to some
pretty strict rules. For example, I want a special syntax for function RTEs
and I don't want the v::type notation to be output (the flag to turn it off
doesn't do what I want).


This will become useful for SQL/MED connectors to other databases. Other 
DBMSs have slightly different syntax, and with something like this you 
could still use ruleutils.c for the deparsing, but tweak it slightly for 
the target database.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] xpath processing brain dead

2009-03-02 Thread Simon Riggs

On Sun, 2009-03-01 at 18:22 -0500, Andrew Dunstan wrote:

 I think the XML type needs to conform to the SQL/XML spec. However, we
 are trying to apply XPath, which has a different data model, to that 
 type - hence the impedance mismatch.
 
 I think that the best we can do (for 8.4, having fixed 8.3 as best we 
 can without adversely changing behaviour) is to throw the
 responsibility 
 for ensuring that the XML passed to the function is an XML document
 back on the programmer. Anything else, especially any mangling of the
 XPath 
 expression, presents a very real danger of breaking on correct input.

Can we provide a single function to bridge the gap between fragment and
document? It will be clearer to do this than to see various forms of
appending/munging, even if that function is a simple wrapper around an
append.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] xpath processing brain dead

2009-03-02 Thread Andrew Dunstan



Peter Eisentraut wrote:

Andrew Dunstan wrote:
can you point me at any call in libxml2 which will evaluate an xpath 
expression in the context of a nodeset instead of a document? Quite 
apart from anything else, xpath requires there to be a (single) 
context node (see http://www.w3.org/TR/xpath20/#context ). For a doc, 
we set that node to the document node, but what would it be for a 
node-set or a fragment? If we can't get over that hurdle we're 
screwed in pursuing your line of thought.


Which may hint at the fact that running xpath on content fragments is 
ill-defined to begin with?!?




Right. But that's no excuse for what we have been doing, which was 
demonstrably providing false results on good input.


cheers

andrew



Re: [HACKERS] xpath processing brain dead

2009-03-02 Thread Andrew Dunstan



Simon Riggs wrote:

On Sun, 2009-03-01 at 18:22 -0500, Andrew Dunstan wrote:

  

I think the XML type needs to conform to the SQL/XML spec. However, we
are trying to apply XPath, which has a different data model, to that 
type - hence the impedance mismatch.


I think that the best we can do (for 8.4, having fixed 8.3 as best we 
can without adversely changing behaviour) is to throw the
responsibility 
for ensuring that the XML passed to the function is an XML document

back on the programmer. Anything else, especially any mangling of the
XPath 
expression, presents a very real danger of breaking on correct input.



Can we provide a single function to bridge the gap between fragment and
document? It will be clearer to do this than to see various forms of
appending/munging, even if that function is a simple wrapper around an
append.

  


I have no objection to providing an *extra* function that explicitly 
wraps non-documents and prefixes the xpath expression in that case, and 
is documented to have limitations. But I don't think we can provide a 
single function that always does the right thing, especially when that 
is so ill-defined in the case of fragments.
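
A rough sketch of what such an explicit wrapping helper might do, assuming the wrapper element is called x and that only absolute path expressions are accepted. Both function names are hypothetical, and the refusal of non-absolute expressions is one possible way to express the documented limitation, not an agreed design.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper element name used to turn a fragment into a document. */
#define WRAP_TAG "x"

/*
 * Return a malloc'd copy of the fragment wrapped in a single root element,
 * e.g. "<a/><b/>" becomes "<x><a/><b/></x>". Returns NULL if out of memory.
 */
char *
wrap_fragment(const char *fragment)
{
    size_t  len;
    char   *doc;

    assert(fragment != NULL);
    /* "<tag>" + fragment + "</tag>" + terminating NUL */
    len = strlen(fragment) + 2 * strlen(WRAP_TAG) + 5 + 1;
    doc = malloc(len);
    if (doc != NULL)
        snprintf(doc, len, "<%s>%s</%s>", WRAP_TAG, fragment, WRAP_TAG);
    return doc;
}

/*
 * Return a malloc'd copy of an absolute path expression prefixed with the
 * wrapper step, e.g. "/a" becomes "/x/a". Expressions that do not start
 * with '/' are refused (NULL), because blindly prefixing them would
 * produce a different or invalid expression.
 */
char *
prefix_xpath(const char *xpath)
{
    size_t  len;
    char   *out;

    assert(xpath != NULL);
    if (xpath[0] != '/')
        return NULL;
    len = 1 + strlen(WRAP_TAG) + strlen(xpath) + 1;
    out = malloc(len);
    if (out != NULL)
        snprintf(out, len, "/%s%s", WRAP_TAG, xpath);
    return out;
}
```

An expression such as count(/a) shows why the prefixing cannot be applied blindly: this sketch simply rejects it rather than mangling it.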


cheers

andrew



Re: [HACKERS] xpath processing brain dead

2009-03-02 Thread Hannu Krosing
On Mon, 2009-03-02 at 07:54 -0500, Andrew Dunstan wrote:
 
 Simon Riggs wrote:
  On Sun, 2009-03-01 at 18:22 -0500, Andrew Dunstan wrote:
 

  I think the XML type needs to conform to the SQL/XML spec. However, we
  are trying to apply XPath, which has a different data model, to that 
  type - hence the impedance mismatch.
 
  I think that the best we can do (for 8.4, having fixed 8.3 as best we 
  can without adversely changing behaviour) is to throw the
  responsibility 
  for ensuring that the XML passed to the function is an XML document
  back on the programmer. Anything else, especially any mangling of the
  XPath 
  expression, presents a very real danger of breaking on correct input.
  
 
  Can we provide a single function to bridge the gap between fragment and
  document? It will be clearer to do this than to see various forms of
  appending/munging, even if that function is a simple wrapper around an
  append.
 

 
 I have no objection to providing an *extra* function that explicitly 
 wraps non-documents and prefixes the xpath expression in that case, and 
 is documented to have limitations. But I don't think we can provide a 
 single function that always does the right thing, especially when that 
 is so ill-defined in the case of fragments.

Is it just that you _can't_ use Xpath on fragments, and you _need_ to
pass full documents to Xpath?

At least this is my reading of Xpath standard.

 cheers
 
 andrew
-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training




Re: [HACKERS] xpath processing brain dead

2009-03-02 Thread Peter Eisentraut

Hannu Krosing wrote:

Is it just that you _can't_ use Xpath on fragments, and you _need_ to
pass full documents to Xpath?


At least this is my reading of Xpath standard.


It is easy to read the XPath standard that way, because the concept of 
fragments is not defined outside of SQL/XML, and is therefore unknown to 
the XPath standard.  The question at hand is rather whether we can 
usefully adapt it.




Re: [HACKERS] xpath processing brain dead

2009-03-02 Thread Andrew Dunstan



Hannu Krosing wrote:

Is it just that you _can't_ use Xpath on fragments, and you _need_ to
pass full documents to Xpath?


At least this is my reading of Xpath standard.

  


I think that's possibly overstating it, unless I have missed something 
(W3 standards are sometimes not much clearer than the SQL standards ;-( )


For instance, there's this, that implies at least that the tree might 
not be a document:


   A / at the beginning of a path expression is an abbreviation for
   the initial step fn:root(self::node()) treat as document-node()/
   (however, if the / is the entire path expression, the trailing /
   is omitted from the expansion.) The effect of this initial step is
   to begin the path at the root node of the tree that contains the
   context node. If the context item is not a node, a type error is
   raised [err:XPTY0020]. At evaluation time, if the root node above
   the context node is not a document node, a dynamic error is raised
   [err:XPDY0050].

The problem is that we certainly do have to provide a context node (the 
standard is clear about that), and unless we want to convert a 
non-document to a node-set as James suggested and then apply the xpath 
expression to each node in the node-set, we have no way of sanely 
specifying the context node.


cheers

andrew



Re: [HACKERS] xpath processing brain dead

2009-03-02 Thread Hannu Krosing
On Mon, 2009-03-02 at 15:25 +0200, Peter Eisentraut wrote:
 Hannu Krosing wrote:
  Is it just that you _can't_ use Xpath on fragments, and you _need_ to
  pass full documents to Xpath?
  
  At least this is my reading of Xpath standard.
 
 It is easy to read the XPath standard that way, because the concept of 
 fragments is not defined outside of SQL/XML, and is therefore unknown to 
 the XPath standard. 

What about the opposite: does SQL/XML specify Xpath usage for XML(SEQUENCE)
and XML(CONTENT)?

  The question at hand is rather whether we can 
 usefully adapt it.

This sounds like trying to adapt integer arithmetic to
lists-of-integers.

Even for simple things like addition, there are several ways of doing it

[1,2,3] + [1,1,1] = [1,2,3,1,1,1]
[1,2,3] + [1,1,1] = [2,3,4]
[1,2,3] + [1,1,1] = [[1,2,3],[1,1,1]]

all seem possible and logical
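
To make the ambiguity concrete, here is a small sketch of the first two readings of "list addition" above (concatenation and element-wise sum). The function names are made up for this illustration; the point is only that both are defensible definitions that give different answers for the same inputs.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Reading 1: "+" means concatenation.
 * out must have room for na + nb ints; returns the result length.
 */
size_t
list_concat_ints(const int *a, size_t na, const int *b, size_t nb, int *out)
{
    size_t i;

    assert(a != NULL && b != NULL && out != NULL);
    for (i = 0; i < na; i++)
        out[i] = a[i];
    for (i = 0; i < nb; i++)
        out[na + i] = b[i];
    return na + nb;
}

/*
 * Reading 2: "+" means element-wise addition.
 * Both inputs must have length n; out must have room for n ints.
 */
size_t
list_add_elementwise(const int *a, const int *b, size_t n, int *out)
{
    size_t i;

    assert(a != NULL && b != NULL && out != NULL);
    for (i = 0; i < n; i++)
        out[i] = a[i] + b[i];
    return n;
}
```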


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training




Re: [HACKERS] Re: [COMMITTERS] pgsql: Redefine _() to dgettext() instead of gettext() so that it uses

2009-03-02 Thread Hiroshi Saito

Hi Peter-san.

I see a problem with plpgsql having its own text domain. On Japanese Windows,
the codeset used for that domain differs from the one intended by the postmaster.
Please look at this log, which shows SJIS text ending up in a message:
http://winpg.jp/~saito/pg_work/LC_MESSAGE_CHECK/plpgsql/before_plpgsql_server.log
The settings in this case are as follows:
==

lc_messages=ja
server_encoding=utf-8
==

Therefore, the codeset needs to be set for the module's own domain as well.
Only server modules need to do this. The patch below solves the problem:
http://winpg.jp/~saito/pg_work/LC_MESSAGE_CHECK/plpgsql/after_plpgsql_server.log


Please take this into consideration.
Thanks.


Regards,
Hiroshi Saito

- Original Message - 
From: Peter Eisentraut pete...@gmx.net




Alvaro Herrera wrote:

Peter Eisentraut wrote:

Log Message:
---
Redefine _() to dgettext() instead of gettext() so that it uses the plpgsql
text domain, instead of the postgres one (or whatever the default may be).


Hmm, so is this needed on all other PLs too?


In principle yes.  Or call dgettext() explicitly, which is also done in 
some cases.  However, in most cases messages are issued through 
ereport(), which handles this automatically (which you implemented, I 
recall).

*** src/backend/utils/mb/mbutils.c.orig Sun Mar  1 01:46:59 2009
--- src/backend/utils/mb/mbutils.c  Sun Mar  1 01:46:48 2009
***************
*** 883,888 ****
--- 883,905 ----
  #endif /* WIN32 */
  
  void
+ SetDomainCodeSet(const char *domainname)
+ {
+ #if defined(ENABLE_NLS) && defined(WIN32)
+     int i;
+     for (i = 0; i < sizeof(codeset_map_array) / sizeof(codeset_map_array[0]); i++)
+     {
+         if (codeset_map_array[i].encoding == GetDatabaseEncoding())
+         {
+             if (bind_textdomain_codeset(domainname, codeset_map_array[i].codeset) == NULL)
+                 elog(LOG, "bind_textdomain_codeset failed");
+             break;
+         }
+     }
+ #endif
+ }
+ 
+ void
  SetDatabaseEncoding(int encoding)
  {
      if (!PG_VALID_BE_ENCODING(encoding))
*** src/include/mb/pg_wchar.h.orig  Sun Mar  1 01:49:00 2009
--- src/include/mb/pg_wchar.h   Sun Mar  1 01:49:48 2009
***************
*** 389,394 ****
--- 389,395 ----
  extern int pg_get_client_encoding(void);
  extern const char *pg_get_client_encoding_name(void);
  
+ extern void SetDomainCodeSet(const char *domainname);
  extern void SetDatabaseEncoding(int encoding);
  extern int GetDatabaseEncoding(void);
  extern const char *GetDatabaseEncodingName(void);
*** src/pl/plpgsql/src/pl_handler.c.orig    Sun Mar  1 01:55:34 2009
--- src/pl/plpgsql/src/pl_handler.c Sun Mar  1 01:57:58 2009
***************
*** 22,27 ****
--- 22,28 ----
  #include "utils/guc.h"
  #include "utils/lsyscache.h"
  #include "utils/syscache.h"
+ #include "mb/pg_wchar.h"
  
  PG_MODULE_MAGIC;
  
***************
*** 43,48 ****
--- 44,51 ----
          return;
  
      pg_bindtextdomain(TEXTDOMAIN);
+     /* domain codeset */
+     SetDomainCodeSet(TEXTDOMAIN);
  
      plpgsql_HashTableInit();
      RegisterXactCallback(plpgsql_xact_cb, NULL);



Re: [HACKERS] regression test crashes at tsearch

2009-03-02 Thread Teodor Sigaev

Um, I think your patch is an overkill reaction to the C-locale problem...

Patch makes char2wchar and wchar2char symmetric to C-locale.


However, I tried your patch.
make check MULTIBYTE=euc_jp NO_LOCALE=true
...
===
All 120 tests passed.
===

Anyway, either should be applied. Thanks.

Thank you. Changes are committed up to 8.2
--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/



Re: [HACKERS] a proposal for an extendable deparser

2009-03-02 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Dave Gudeman wrote:
 I don't need to add new node types or add any syntax; it is the output that
 I'm concerned with. What I want is a way to print a tree according to some
 pretty strict rules. For example, I want a special syntax for function RTEs
 and I don't want the v::type notation to be output (the flag to turn it off
 doesn't do what I want).

 This will become useful for SQL/MED connectors to other databases. Other 
 DBMSs have slightly different syntax, and with something like this you 
 could still use ruleutils.c for the deparsing, but tweak it slightly for 
 the target database.

That all sounds like pie in the sky to me.  It's unlikely that you could
produce any specified syntax with just minor changes to the dumping of a
node type or two --- the node structure is specific to Postgres' view of
the world and won't necessarily be amenable to producing someone else's
syntax.

On the whole, copy and paste ruleutils seems like a sufficient answer
to me.  Maybe when we have a couple of examples of people having to do
that, we can figure out an abstraction that solves the problem better;
but I have no confidence that the mechanism Dave proposes will help
or will be worth the trouble to implement.

An even more likely answer is patch ruleutils so it has an extra flag
that does what you want.  We might or might not be willing to take such
a patch back into core, but it sure seems like a lot less work.

regards, tom lane



Re: [HACKERS] regression test crashes at tsearch

2009-03-02 Thread Greg Stark
On Wed, Feb 25, 2009 at 6:44 PM, Teodor Sigaev teo...@sigaev.ru wrote:
 mbstowcs/wcstombs doesn't work with C-locale in other OSes too, so that's
 not needed.


Say what? What OSes is that?

-- 
greg



[HACKERS] proposal: psql - breaking rows in white chars for wrapped format

2009-03-02 Thread Pavel Stehule
Hello

The current wrapped format doesn't break rows well (after whitespace). I
propose changing this behavior (to one more typical for text):

Current:

postgres=# \pset format wrapped
Output format is wrapped.
postgres=# select a from test;
   a

 There are many variations of passages of Lorem Ipsum a
 vailable, but the majority have suffered alteration in
  some form, by injected humour, or randomised words wh
 ich don't look even slightly believable. If you are go
 ing to use a passage of Lorem Ipsum, you need to be su
 re there isn't anything embarrassing hidden in the mid
 dle of text. All the Lorem Ipsum generators on the Int
 ernet tend to repeat predefined chunks as necessary, m
 aking this the first true generator on the Internet. I
 t uses a dictionary of over 200 Latin words, combined
 with a handful of model sentence structures, to genera
 te Lorem Ipsum which looks reasonable. The generated L
 orem Ipsum is therefore always free from repetition, i
 njected humour, or non-characteristic words etc.
(1 row)

Proposal:
postgres=# \pset format wrapped
Output format is wrapped.
postgres=# select a from test;
   a

 There are many variations of passages of Lorem Ipsum
 available, but the majority have suffered alteration
 in some form, by injected humour, or randomised words
 which don't look even slightly believable. If you are
 going to use a passage of Lorem Ipsum, you need to be
 sure there isn't anything embarrassing hidden in the
 middle of text. All the Lorem Ipsum generators on the
 Internet tend to repeat predefined chunks as
 necessary, making this the first true generator on the
 Internet. It uses a dictionary of over 200 Latin
 words, combined with a handful of model sentence
 structures, to generate Lorem Ipsum which looks
 reasonable. The generated Lorem Ipsum is therefore
 always free from repetition, injected humour, or
 non-characteristic words etc.
(1 row)

It could be implemented by checking for whitespace inside the
strlen_max_width function.
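
A minimal sketch of the whitespace check described here, assuming single-byte characters where one byte occupies one display column (psql's real code also has to deal with multibyte encodings and display widths). next_break is a hypothetical helper, not the existing strlen_max_width function.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Return how many bytes of s should go on the current output line when
 * lines are at most width columns wide. Prefers to break at the last
 * whitespace character that still fits; falls back to a hard break at
 * width when the line contains no whitespace at all.
 *
 * The caller is expected to skip the whitespace at the returned position
 * before starting the next line.
 */
size_t
next_break(const char *s, size_t width)
{
    size_t len;
    size_t i;

    assert(s != NULL && width > 0);
    len = strlen(s);
    if (len <= width)
        return len;             /* the rest fits on one line */

    /* Scan backwards from the width limit looking for whitespace. */
    for (i = width; i > 0; i--)
    {
        if (isspace((unsigned char) s[i]))
            return i;           /* break just before this whitespace */
    }
    return width;               /* no whitespace: hard break mid-word */
}
```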

Any comments or ideas are welcome.

regards
Pavel Stehule



Re: [HACKERS] proposal: psql - breaking rows in white chars for wrapped format

2009-03-02 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 The current wrapped format doesn't break rows well (after whitespace). I
 propose changing this behavior (to one more typical for text):

This was suggested and rejected already.  Please read the earlier thread.

regards, tom lane



Re: [HACKERS] proposal: psql - breaking rows in white chars for wrapped format

2009-03-02 Thread Pavel Stehule
2009/3/2 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 The current wrapped format doesn't break rows well (after whitespace). I
 propose changing this behavior (to one more typical for text):

 This was suggested and rejected already.  Please read the earlier thread.

                        regards, tom lane


ok,

regards
Pavel Stehule



[HACKERS] GIN, partial matches, lossy bitmaps

2009-03-02 Thread Heikki Linnakangas
While reading the GIN code, I just rediscovered that the GIN partial 
match support suffers from the same problem that I criticized the fast 
insert patch for, and searching the archives I found that I already 
complained about that back in April:


http://archives.postgresql.org/pgsql-patches/2008-04/msg00157.php

If I'm reading the code correctly, item pointers of all matching heap 
tuples are first collected into a TIDBitmap, and then amgetnext returns 
tuples from that one by one. If the bitmap becomes lossy, an error is 
thrown. gingetbitmap is a dummy implementation: it creates a new 
TIDBitmap and inserts all the tuples from the other TIDBitmap into it 
one by one, and then returns the new TIDBitmap.


If we remove the support for regular, non-bitmap, index scans with GIN, 
that could be cleaned up as well. Even if we don't do that, gingetbitmap 
should not error when the bitmap becomes lossy, but just return the 
lossy bitmap.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] regression test crashes at tsearch

2009-03-02 Thread Teodor Sigaev


Say what? What OSes is that?

See the attached test program. It tries to convert a multibyte Russian word in 
UTF-8 to wide chars under the C, ru_RU.KOI8-R and ru_RU.UTF-8 locales. The word 
contains 6 letters.


FreeBSD 7.2 (short output):
C==
mbstowcs returns 12
ru_RU.KOI8-R==
mbstowcs returns 12
ru_RU.UTF-8==
mbstowcs returns 6

Linux 2.6.23 libc 2.5 (short output):
C==
mbstowcs returns -1
ru_RU.KOI8-R==
mbstowcs returns 12
ru_RU.UTF-8==
mbstowcs returns 6


The program also prints the result of an iswalpha test:
Linux 2.6.23 libc 2.5 (full output):
C==
mbstowcs returns -1
ERROR
ru_RU.KOI8-R==
mbstowcs returns 12
0-th chacter is alpha
1-th chacter is NOT alpha
2-th chacter is alpha
3-th chacter is NOT alpha
4-th chacter is alpha
5-th chacter is NOT alpha
6-th chacter is alpha
7-th chacter is NOT alpha
8-th chacter is alpha
9-th chacter is NOT alpha
10-th chacter is alpha
11-th chacter is NOT alpha
ru_RU.UTF-8==
mbstowcs returns 6
0-th chacter is alpha
1-th chacter is alpha
2-th chacter is alpha
3-th chacter is alpha
4-th chacter is alpha
5-th chacter is alpha


t.c.gz
Description: application/gzip
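
The attachment itself is not reproduced in this archive. As a rough idea of what such a test might look like, here is a sketch that switches LC_CTYPE to a named locale and reports what mbstowcs() returns. count_wide_chars and its return conventions are assumptions made for this illustration, not the contents of t.c.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

#define MAX_WCHARS 64

/*
 * Convert mbs to wide characters under the given locale.
 * Returns the number of wide characters produced, -1 if mbstowcs()
 * reports an invalid multibyte sequence, or -2 if the locale cannot
 * be selected on this system.
 */
long
count_wide_chars(const char *locale_name, const char *mbs)
{
    wchar_t buf[MAX_WCHARS];
    size_t  n;

    assert(locale_name != NULL && mbs != NULL);
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -2;

    n = mbstowcs(buf, mbs, MAX_WCHARS);
    if (n == (size_t) -1)
        return -1;
    return (long) n;
}
```

Calling it with a 6-letter Russian word encoded in UTF-8 (12 bytes) under "C", "ru_RU.KOI8-R" and "ru_RU.UTF-8" would be one way to reproduce the kind of comparison shown above; the results depend on the C library and on which locales are installed.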



Re: [HACKERS] [BUGS] BUG #4680: Server crashed if using wrong (mismatch) conversion functions

2009-03-02 Thread Tom Lane
I wrote:
 In any case, that's orthogonal to the part that I was focusing on,
 which was to try to prevent error recursion as a result of trouble
 in the encoding conversion subsystem.  It looks like we could do that
 with some additional hacking in send_message_to_frontend() to avoid
 conversion, as well as translation, when in_error_recursion_trouble()
 is true.  Your point about there possibly being non-ASCII user-inserted
 data in the message is a bit troubling, but for the cases where
 recursion is actually occurring I don't think that that will happen.

Here is a proposed patch that does this.  It largely reverts my patch
of 2008-10-27 in favor of a more general policy that says that *all*
localization of error messages is disabled once we get into error
recursion trouble.  Having done that, we can reasonably assume that
the error message text is 7-bit ASCII, and therefore bypass encoding
conversion as well.  This fixes the example reported in bug #4680
(even without the subsequent patch to prevent that case from arising),
and it still prevents the cases that my previous patch was meant to
deal with.

Comments, objections?

regards, tom lane

Index: src/backend/libpq/pqformat.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/libpq/pqformat.c,v
retrieving revision 1.48
diff -c -r1.48 pqformat.c
*** src/backend/libpq/pqformat.c    1 Jan 2009 17:23:42 -0000   1.48
--- src/backend/libpq/pqformat.c    2 Mar 2009 19:13:12 -0000
***************
*** 41,46 ****
--- 41,47 ----
   *      pq_sendcountedtext - append a counted text string (with character set conversion)
   *      pq_sendtext     - append a text string (with conversion)
   *      pq_sendstring   - append a null-terminated text string (with conversion)
+  *      pq_send_raw_string - append a null-terminated text string (without conversion)
   *      pq_endmessage   - send the completed message to the frontend
   * Note: it is also possible to append data to the StringInfo buffer using
   * the regular StringInfo routines, but this is discouraged since required
***************
*** 184,190 ****
  pq_sendstring(StringInfo buf, const char *str)
  {
      int         slen = strlen(str);
- 
      char       *p;
  
      p = pg_server_to_client(str, slen);
--- 185,190 ----
***************
*** 199,204 ****
--- 199,221 ----
  }
  
  /* --------------------------------
+  *      pq_send_raw_string  - append a null-terminated text string (without conversion)
+  *
+  * This function intentionally bypasses encoding conversion; it should be used
+  * only in very specialized cases, preferable only when the given string is
+  * known to be just 7-bit ASCII.
+  *
+  * NB: passed text string must be null-terminated, and so is the data
+  * sent to the frontend.
+  * --------------------------------
+  */
+ void
+ pq_send_raw_string(StringInfo buf, const char *str)
+ {
+     appendBinaryStringInfo(buf, str, strlen(str) + 1);
+ }
+ 
+ /* --------------------------------
   *      pq_sendint      - append a binary integer to a StringInfo buffer
   * --------------------------------
   */
Index: src/backend/utils/error/elog.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/error/elog.c,v
retrieving revision 1.212
diff -c -r1.212 elog.c
*** src/backend/utils/error/elog.c  19 Jan 2009 15:34:23 -0000  1.212
--- src/backend/utils/error/elog.c  2 Mar 2009 19:13:13 -0000
***************
*** 72,77 ****
--- 72,80 ----
  #include "utils/ps_status.h"
  
  
+ #undef _
+ #define _(x) err_gettext(x)
+ 
  /* Global variables */
  ErrorContextCallback *error_context_stack = NULL;
  
***************
*** 165,170 ****
--- 168,192 ----
  }
  
  /*
+  * One of those fallback steps is to stop trying to localize the error
+  * message, since there's a significant probability that that's exactly
+  * what's causing the recursion.
+  */
+ static inline const char *
+ err_gettext(const char *str)
+ {
+ #ifdef ENABLE_NLS
+     if (in_error_recursion_trouble())
+         return str;
+     else
+         return gettext(str);
+ #else
+     return str;
+ #endif
+ }
+ 
+ 
+ /*
   * errstart --- begin an error-reporting cycle
   *
   * Create a stack entry and store the given parameters in it.  Subsequently,
***************
*** 631,637 ****
          char           *fmtbuf; \
          StringInfoData  buf; \
          /* Internationalize the error format string */ \
!         if (translateit) \
              fmt = dgettext(edata->domain, fmt); \
          /* Expand %m in format string */ \
          fmtbuf = expand_fmt_string(fmt, edata); \
--- 653,659 ----
          char           *fmtbuf; \
          StringInfoData  buf; \
          /* Internationalize the error 

Re: [HACKERS] regression test crashes at tsearch

2009-03-02 Thread Greg Stark
Hmm, well, the KOI8 tests unsurprisingly produce random results on non-KOI8
input. It's pure chance you didn't get EILSEQ.


What errno did you get for the C locale test? On which input character?
Perhaps it's signalling EILSEQ for every byte > 0x80? That seems
broken to me but perhaps not to a glibc pedant out there.


--
Greg


On 2 Mar 2009, at 19:17, Teodor Sigaev teo...@sigaev.ru wrote:


Say what? What OSes is that?
See the attached test program. It tries to convert a multibyte Russian
word in UTF-8 to wide chars under the C, ru_RU.KOI8-R and ru_RU.UTF-8
locales. The word contains 6 letters.


FreeBSD 7.2 (short output):
C==
mbstowcs returns 12
ru_RU.KOI8-R==
mbstowcs returns 12
ru_RU.UTF-8==
mbstowcs returns 6

Linux 2.6.23 libc 2.5 (short output):
C==
mbstowcs returns -1
ru_RU.KOI8-R==
mbstowcs returns 12
ru_RU.UTF-8==
mbstowcs returns 6


The program also prints test of iswalpha:
Linux 2.6.23 libc 2.5 (full output):
C==
mbstowcs returns -1
ERROR
ru_RU.KOI8-R==
mbstowcs returns 12
   0-th chacter is alpha
   1-th chacter is NOT alpha
   2-th chacter is alpha
   3-th chacter is NOT alpha
   4-th chacter is alpha
   5-th chacter is NOT alpha
   6-th chacter is alpha
   7-th chacter is NOT alpha
   8-th chacter is alpha
   9-th chacter is NOT alpha
   10-th chacter is alpha
   11-th chacter is NOT alpha
ru_RU.UTF-8==
mbstowcs returns 6
   0-th chacter is alpha
   1-th chacter is alpha
   2-th chacter is alpha
   3-th chacter is alpha
   4-th chacter is alpha
   5-th chacter is alpha
t.c.gz




[HACKERS] Re: [BUGS] BUG #4680: Server crashed if using wrong (mismatch) conversion functions

2009-03-02 Thread Heikki Linnakangas

Tom Lane wrote:

I wrote:

In any case, that's orthogonal to the part that I was focusing on,
which was to try to prevent error recursion as a result of trouble
in the encoding conversion subsystem.  It looks like we could do that
with some additional hacking in send_message_to_frontend() to avoid
conversion, as well as translation, when in_error_recursion_trouble()
is true.  Your point about there possibly being non-ASCII user-inserted
data in the message is a bit troubling, but for the cases where
recursion is actually occurring I don't think that that will happen.


Here is a proposed patch that does this.  It largely reverts my patch
of 2008-10-27 in favor of a more general policy that says that *all*
localization of error messages is disabled once we get into error
recursion trouble.  Having done that, we can reasonably assume that
the error message text is 7-bit ASCII, and therefore bypass encoding
conversion as well.  This fixes the example reported in bug #4680
(even without the subsequent patch to prevent that case from arising),
and it still prevents the cases that my previous patch was meant to
deal with.

Comments, objections?


Looks ok to me. I'm still a bit uneasy about the assumption that the 
error message is 7-bit ASCII. Maybe that's just because I don't fully 
understand all the conditions how we can end up in recursion, so I would 
still put something into pq_send_raw_string to check that the string 
really is 7-bit ASCII. Just in case. Maybe clear all the high bits, or 
replace non-ASCII characters with question marks.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] sanity check on max_fsm_relations

2009-03-02 Thread Robert Treat
I have an app that needs to create about 50 partitions per day. I'm planning 
to boost max_fsm_relations to about 100,000, so I won't have to worry 
about changing it again until I can upgrade to 8.4 ;-)  According to the 
docs, this should take about 6MB of shmem, which is no big deal, but I'm 
wondering if there might be other performance implications that I'm not 
aware of?  Has anyone ever run with six-figure fsm relations (not just pages) 
before? 

-- 
Robert Treat
Conjecture: http://www.xzilla.net
Consulting: http://www.omniti.com



Re: [HACKERS] [BUGS] BUG #4680: Server crashed if using wrong (mismatch) conversion functions

2009-03-02 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Looks ok to me. I'm still a bit uneasy about the assumption that the 
 error message is 7-bit ASCII. Maybe that's just because I don't fully 
 understand all the conditions under which we can end up in recursion, so I 
 would still put something into pq_send_raw_string to check that the string 
 really is 7-bit ASCII. Just in case. Maybe clear all the high bits, or 
 replace non-ASCII characters with question marks.

Throwing an error is not the answer --- we only get here after we've
already failed to do that, repeatedly.  But it's certainly not a path
that requires high performance, so I have no problem with the
replace-non-ascii-with-question-mark idea.  Will make it so.
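
A minimal sketch of the replace-non-ASCII-with-question-mark idea, for
illustration only. The function name force_ascii_in_place is hypothetical
and this is not the code that went into pq_send_raw_string; it only assumes
a NUL-terminated buffer that may be modified in place. It never allocates
and never reports an error, which matters on a path that is only reached
after error reporting has already failed.

```c
/*
 * Hypothetical helper (not the actual PostgreSQL code): overwrite every
 * byte that has its high bit set with '?', so that the result is pure
 * 7-bit ASCII.  Works in place; cannot fail.
 */
void
force_ascii_in_place(char *s)
{
    for (; *s != '\0'; s++)
    {
        /* Cast first: plain char may be signed on this platform. */
        if ((unsigned char) *s > 127)
            *s = '?';
    }
}
```

Each byte of a multibyte character is replaced separately, so a two-byte
UTF-8 sequence becomes two question marks.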

regards, tom lane



Re: [HACKERS] proposal: psql - breaking rows in white chars for wrapped format

2009-03-02 Thread Pavel Stehule
2009/3/2 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 The current wrapped format doesn't break rows well (after whitespace
 characters). I propose changing this behavior (to what is more typical
 for text):

 This was suggested and rejected already.  Please read the earlier thread.

                        regards, tom lane


Hi,

I found some of it, but I really don't understand why this proposal was
rejected. Could you please spell out the main objections?

Thank You
Pavel Stehule



Re: [HACKERS] GIN, partial matches, lossy bitmaps

2009-03-02 Thread Jeff Davis
On Mon, 2009-03-02 at 21:14 +0200, Heikki Linnakangas wrote:
 If I'm reading the code correctly, item pointers of all matching heap 
 tuples are first collected into a TIDBitmap, and then amgetnext returns 
 tuples from that one by one. If the bitmap becomes lossy, an error is 
 thrown. gingetbitmap is a dummy implementation: it creates a new 
 TIDBitmap and inserts all the tuples from the other TIDBitmap into it 
 one by one, and then returns the new TIDBitmap.

Do you think that might be the cause of the extra startup overhead that
Robert Haas observed for bitmap scans?

 If we remove the support for regular, non-bitmap, index scans with GIN, 
 that could be cleaned up as well. Even if we don't do that, gingetbitmap 
 should not error when the bitmap becomes lossy, but just return the 
 lossy bitmap.

That sounds reasonable to me.

Regards,
Jeff Davis




Re: [HACKERS] regression test crashes at tsearch

2009-03-02 Thread Teodor Sigaev


Hmm well the KOI8 tests unsurprisingly produce random results on 
non-KOI8 input. It's pure chance you didn't get EILSEQ.

Because KOI8 is not a multibyte encoding.

What errno did you get for the C locale test? On which input 
character? Perhaps it's signalling EILSEQ for every byte 0x80? That 
seems broken to me but perhaps not to a glibc pedant out there.


Linux
C==
mbstowcs returns -1 errno: 84 Invalid or incomplete multibyte or wide 
character

ru_RU.KOI8-R==
mbstowcs returns 12 errno: 0 Success
ru_RU.UTF-8==
mbstowcs returns 6 errno: 0 Success

FreeBSD
C==
mbstowcs returns 12 errno: 0 Unknown error: 0
ru_RU.KOI8-R==
mbstowcs returns 12 errno: 0 Unknown error: 0
ru_RU.UTF-8==
mbstowcs returns 6 errno: 0 Unknown error: 0

In any case, we cannot trust mbstowcs if the current locale does not match 
the encoding. We cannot check every locale/encoding pair, but 
mbstowcs with the C locale and a multibyte encoding obviously doesn't work.
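
A rough sketch of the kind of probe that produces output like the above.
This is not the attached t-1.c; the function name probe_mbstowcs is
hypothetical. It assumes a POSIX-style mbstowcs() that accepts a NULL
destination and then only counts the wide characters.

```c
#include <errno.h>
#include <locale.h>
#include <stdlib.h>

/*
 * Hypothetical probe: switch LC_CTYPE to locale_name, ask mbstowcs() how
 * many wide characters it sees in input, and store the resulting errno in
 * *err.  Returns (size_t) -1 if the locale is unavailable or the
 * conversion fails.  LC_CTYPE stays changed after the call.
 */
size_t
probe_mbstowcs(const char *locale_name, const char *input, int *err)
{
    size_t      n;

    if (setlocale(LC_CTYPE, locale_name) == NULL)
    {
        *err = EINVAL;          /* our own marker: locale not installed */
        return (size_t) -1;
    }

    errno = 0;
    n = mbstowcs(NULL, input, 0);   /* NULL destination: count only */
    *err = errno;
    return n;
}
```

Calling it with "C", "ru_RU.KOI8-R" and "ru_RU.UTF-8" on the same byte
string is one way to compare platforms; whether those locales are
installed depends on the system.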




t-1.c.gz
Description: application/gzip



[HACKERS] statistics horribly broken for row-wise comparison

2009-03-02 Thread Merlin Moncure
It looks like for row-wise comparison, only the first column is used
for generating the expected row count.  This can lead to bad plans in
some cases.

Test case (takes seconds to minutes, depending on hardware):

create table range as select v as id, v % 500 as key, now() +
((random() * 1000) || 'days')::interval as ts from
generate_series(1,1000) v;

create index range_idx on range(key, ts);

explain analyze select * from range where (key, ts) >= (222, '7/11/2009') and
(key, ts) <= (222, '7/12/2009')
order by key, ts;

result (cut down a bit)
Sort  (cost=469723.46..475876.12 rows=2461061 width=16) (actual
time=0.054..0.056 rows=13 loops=1)
   Sort Key: key, ts
   Sort Method:  quicksort  Memory: 25kB

Note the expected row count vs. the actual one.  Varying the ts parameters
changes the expected rows, but varying the key does not.  For this
test case the returned plan is ok, but obviously the planner can
freak out and drop to a seq scan or do other nefarious things,
depending on circumstances.

I confirmed this on 8.2 and HEAD (a month old or so).

merlin



Re: [HACKERS] statistics horribly broken for row-wise comparison

2009-03-02 Thread Merlin Moncure
On Mon, Mar 2, 2009 at 4:43 PM, Merlin Moncure mmonc...@gmail.com wrote:
 It looks like for row-wise comparison, only the first column is used
 for generating the expected row count.  This can lead to bad plans in
 some cases.

 Test case (takes seconds to minutes, depending on hardware):

 create table range as select v as id, v % 500 as key, now() +
 ((random() * 1000) || 'days')::interval as ts from
 generate_series(1,1000) v;

 create index range_idx on range(key, ts);

 explain analyze select * from range where (key, ts) >= (222, '7/11/2009') and
        (key, ts) <= (222, '7/12/2009')
        order by key, ts;

 result (cut down a bit)
 Sort  (cost=469723.46..475876.12 rows=2461061 width=16) (actual
 time=0.054..0.056 rows=13 loops=1)
   Sort Key: key, ts
   Sort Method:  quicksort  Memory: 25kB

 note the row count expected vs. got.  Varying the ts parameters
 changes the expected rows, but varying the key does not.  Note for the

Oops, that's backwards.  The expected row count depends on key (the first
column in the index) only.

merlin



Re: [HACKERS] add_path optimization

2009-03-02 Thread Kevin Grittner
 Tom Lane t...@sss.pgh.pa.us wrote: 
 After a lot of distractions, I've finished applying the planner fixes
 that seem necessary in view of your report about poorer planning in 8.4
 than 8.3.  When you have a chance, it would be useful to do a thorough
 test of CVS HEAD on your data and query mix --- are there any other
 places where we have regressed relative to 8.3?
 
You probably already know this, but on the query referenced earlier in
the thread, a great plan is now generated!  Even when not cached, this
executed in just under seven seconds.  (I chose these values for
testing this query because it had been logged as exceeding 20 seconds
under 8.3.)  Cached, EXPLAIN ANALYZE runs between 225 and 250 ms. 
Running it without EXPLAIN the psql \timing is between 265 and 277 ms.
EXPLAIN gives a \timing averaging 80 ms.
 
I will see what kind of testing I can put together to try to shake out
any remaining issues.
 
-Kevin

 Sort  (cost=609318.69..609324.64 rows=2377 width=136) (actual time=6993.549..6994.589 rows=2388 loops=1)
   Sort Key: C.caseNo
   Sort Method:  quicksort  Memory: 692kB
   ->  HashAggregate  (cost=609161.63..609185.40 rows=2377 width=136) (actual time=6986.256..6989.144 rows=2388 loops=1)
         ->  Append  (cost=0.00..609060.61 rows=2377 width=136) (actual time=14.514..6977.019 rows=2396 loops=1)
               ->  Nested Loop  (cost=0.00..580976.79 rows=2272 width=136) (actual time=14.512..6505.233 rows=2315 loops=1)
                     ->  Nested Loop Left Join  (cost=0.00..371212.96 rows=2272 width=129) (actual time=14.441..6445.091 rows=2315 loops=1)
                           ->  Nested Loop  (cost=0.00..358250.51 rows=2221 width=121) (actual time=14.294..3821.576 rows=2315 loops=1)
                                 ->  Nested Loop Left Join  (cost=0.00..357628.46 rows=2221 width=115) (actual time=14.259..3809.529 rows=2315 loops=1)
                                       ->  Nested Loop Left Join  (cost=0.00..323081.09 rows=2221 width=115) (actual time=6.324..1322.972 rows=2315 loops=1)
                                             Filter: (((WPCT.profileName IS NOT NULL) OR (((C.caseType)::text = ANY ('{PA,JD}'::text[])) AND (NOT C.isConfidential))) AND (((WPCT.profileName)::text  'PUBLIC'::text) OR ((C.caseType)::text  'FA'::text) OR ((C.wcisClsCode)::text  '40501'::text)))
                                             ->  Nested Loop Anti Join  (cost=0.00..322299.90 rows=2221 width=116) (actual time=6.211..1296.189 rows=2609 loops=1)
                                                   ->  Nested Loop  (cost=0.00..317628.98 rows=3355 width=116) (actual time=6.109..1191.783 rows=3719 loops=1)
                                                         Join Filter: (P.partyType)::text = ANY ('{JV,CH}'::text[])) AND ((C.caseType)::text = 'ZZ'::text)) OR ((P.partyType)::text  ALL ('{JV,CH}'::text[]))) AND (((C.caseType)::text  ALL ('{CF,CI,CM,CT,FO,TR}'::text[])) OR ((P.partyType)::text = 'DE'::text)) AND C.caseType)::text = ANY ('{JA,JC,JG,JM,JO,JV,JI,TP}'::text[])) AND ((P.partyType)::text = ANY ('{CH,JV}'::text[]))) OR (((C.caseType)::text  ALL ('{JA,JC,JG,JM,JO,JV,JI,TP}'::text[])) AND ((P.partyType)::text  ALL ('{CH,JV}'::text[] AND (((P.partyType)::text  ALL ('{PE,PL,JP}'::text[])) OR C.filingDate)::date  '2008-11-01'::date) OR ((C.wcisClsCode)::text  '30709'::text)) AND (((C.caseType)::text  ALL ('{CV,FA}'::text[])) OR ((C.wcisClsCode)::text  '30711'::text) OR (NOT (alternative subplans))
                                                         ->  Index Scan using Party_SearchName on Party P  (cost=0.00..14.38 rows=3073 width=60) (actual time=5.918..935.944 rows=4132 loops=1)
                                                               Index Cond: (((searchName)::text = 'HILL,J'::text) AND ((searchName)::text  'HILL,K'::text))
                                                               Filter: ((NOT isSeal) AND ((searchName)::text ~~ 'HILL,J%'::text))
                                                         ->  Index Scan using Case_pkey on Case C  (cost=0.00..11.25 rows=1 width=72) (actual time=0.054..0.054 rows=1 loops=4132)
                                                               Index Cond: (((C.countyNo)::smallint = (P.countyNo)::smallint) AND ((C.caseNo)::text = (P.caseNo)::text))
                                                               Filter: (C.isExpunge  true)
                                                         SubPlan
                                                           ->  Seq Scan on CaseHist CHPET  (cost=0.00..5080459.80 rows=84980 width=15) (never executed)
                                                                 Filter: ((eventType)::text = ANY ('{FWBCA,CCTRO}'::text[]))
                                                           ->  Index Scan using CaseHist_pkey on CaseHist CHPET  (cost=0.00..92.03 rows=1 width=0) (actual time=0.776..0.776 rows=0 loops=16)
   

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2009-03-02 Thread Bryce Cutt
Here is the new patch.

Our experiments show no noticeable performance issue when using the
patch for cases where the optimization is not used because the number
of extra statements executed when the optimization is disabled is
insignificant.

We have updated the patch to remove a couple of if statements, but
this is really minor.  The biggest change was to MultiExecHash that
avoids an if check per tuple by duplicating the hashing loop.

To demonstrate the differences, here is an analysis of the code
changes and their impact.

Three cases:
1) One batch hash join - Optimization is disabled.  Extra statements
executed are:
 - One if (hashtable->nbatch > 1) in ExecHashJoin (line 356 of nodeHashjoin.c)
 - One if optimization_on in MultiExecHash (line 259 of nodeHash.c)
 - One if optimization_on in MultiExecHash per probe tuple (line 431
of nodeHashjoin.c)
 - One if statement in ExecScanHashBucket per probe tuple (line 1071
of nodeHash.c)

2) Multi-batch hash join with limited skew - Optimization is disabled.
 Extra statements executed are:
 - One if (hashtable->nbatch > 1) in ExecHashJoin (line 356 of nodeHashjoin.c)
 - Executes ExecHashJoinDetectSkew method (at line 357 of
nodeHashjoin.c) that reads stats tuple for probe relation attribute
and determines if skew is above cut-off.  In this case, skew is not
above cutoff and no extra memory is used.
 - One if optimization_on in MultiExecHash (line 259 of nodeHash.c)
 - One if optimization_on in MultiExecHash per probe tuple (line 431
of nodeHashjoin.c)
 - One if statement in ExecScanHashBucket per probe tuple (line 1071
of nodeHash.c)

3) Multi-batch hash join with skew - Optimization is enabled. Extra
statements executed are:
 - One if (hashtable->nbatch > 1) in ExecHashJoin (line 356 of nodeHashjoin.c)
 - Executes ExecHashJoinDetectSkew method (at line 357 of
nodeHashjoin.c) that reads stats tuple for probe relation attribute
and determines there is skew.  Allocates space for XXX which is 2% of
work_mem.
 - One if optimization_on in MultiExecHash (line 259 of nodeHash.c)
 - In MultiExecHash after each tuple is hashed determines if its join
attribute value matches one of the MCVs.  If it does, it is put in the
MCV structure.  Cost is the hash and search for each build tuple.
 - If all IM buckets end up frozen in the build phase (MultiExecHash)
because they grow larger than the memory allowed for IM buckets then
skew optimization is turned off and the probe phase reverts to Case 2
 - For each probe tuple, determines if its value is a MCV by
performing hash and quick table lookup.  If yes, probes MCV bucket
otherwise does regular hash algorithm as usual.
 - One if statement in ExecScanHashBucket per probe tuple (line 1071
of nodeHash.c)
 - Additional cost is determining if a tuple is a common tuple (both
on build and probe side).  This additional cost is dramatically
outweighed by avoiding disk I/Os (even if they never hit the disk due
to caching).
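
The "hash and quick table lookup" step in Case 3 can be pictured roughly
as follows. This is only an illustrative sketch, not code from the patch:
McvTable, mcv_insert, mcv_lookup and the table size are all hypothetical,
and a real implementation would also compare the actual join values, since
equal hashes do not guarantee equal keys.

```c
#include <stdbool.h>
#include <stdint.h>

#define MCV_SLOTS 8     /* illustrative size; must be a power of two */

/* Hypothetical lookup table keyed by the hash of each MCV join value. */
typedef struct McvTable
{
    bool        used[MCV_SLOTS];
    uint32_t    hash[MCV_SLOTS];
} McvTable;

/* Register an MCV hash with linear probing; false if the table is full. */
bool
mcv_insert(McvTable *t, uint32_t h)
{
    int         i;

    for (i = 0; i < MCV_SLOTS; i++)
    {
        int         slot = (int) ((h + (uint32_t) i) & (MCV_SLOTS - 1));

        if (!t->used[slot])
        {
            t->used[slot] = true;
            t->hash[slot] = h;
            return true;
        }
        if (t->hash[slot] == h)
            return true;        /* already registered */
    }
    return false;
}

/* Return the MCV bucket number for hash h, or -1 if h is not an MCV. */
int
mcv_lookup(const McvTable *t, uint32_t h)
{
    int         i;

    for (i = 0; i < MCV_SLOTS; i++)
    {
        int         slot = (int) ((h + (uint32_t) i) & (MCV_SLOTS - 1));

        if (!t->used[slot])
            return -1;          /* an empty slot ends the probe sequence */
        if (t->hash[slot] == h)
            return slot;
    }
    return -1;
}
```

A probe tuple whose hash gets -1 from mcv_lookup would go through the
regular hash join path; any other result selects the in-memory MCV bucket.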

The if statement on line 440 of nodeHashjoin.c (in ExecHashJoin) has
been rearranged so that in the single batch case short circuit
evaluation requires only the first test in the IF to be checked.

The limited skew check mentioned in Case 2 above is a simple check
in the ExecHashJoinDetectSkew function.

- Bryce Cutt



On Thu, Feb 26, 2009 at 12:16 PM, Bryce Cutt pandas...@gmail.com wrote:
 The patch originally modified the cost function but I removed that
 part before we submitted it to be a bit conservative about our
 proposed changes.  I didn't like that for large plans the statistics
 were retrieved and calculated many times when finding the optimal
 query plan.

 The overhead of the algorithm when the skew optimization is not used
 ends up being roughly a function call and an if statement per tuple.
 It would be easy to remove the function call per tuple.  Dr. Lawrence
 has come up with some changes so that when the optimization is turned
 off, the function call does not happen at all and instead of the if
 statement happening per tuple it is run just once per join.  We have
 to test this a bit more but it should further reduce the overhead.

 Hopefully we will have the new patch ready to go this weekend.

 - Bryce Cutt


 On Thu, Feb 26, 2009 at 7:45 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki's got a point here: the planner is aware that hashjoin doesn't
 like skewed distributions, and it assigns extra cost accordingly if it
 can determine that the join key is skewed.  (See the bucketsize stuff
 in cost_hashjoin.)  If this patch is accepted we'll want to tweak that
 code.

 Still, that has little to do with the current gating issue, which is
 whether we've convinced ourselves that the patch doesn't cause a
 performance decrease for cases in which it's unable to help.

                        regards, tom lane




histojoin_v6.patch
Description: Binary data


Re: [HACKERS] statistics horribly broken for row-wise comparison

2009-03-02 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 It looks like for row-wise comparison, only the first column is used
 for generating the expected row count.

[ shrug... ]  Short of multi-column statistics, it's hard to see how to
do better.

regards, tom lane



Re: [HACKERS] GIN, partial matches, lossy bitmaps

2009-03-02 Thread Robert Haas
On Mon, Mar 2, 2009 at 2:14 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 While reading the GIN code, I just rediscovered that the GIN partial match
 support suffers from the same problem that I criticized the fast insert
 patch about, and searching the archives I found that I already complained
 about that back in April:

 http://archives.postgresql.org/pgsql-patches/2008-04/msg00157.php

 If I'm reading the code correctly, item pointers of all matching heap tuples
 are first collected into a TIDBitmap, and then amgetnext returns tuples from
 that one by one. If the bitmap becomes lossy, an error is thrown.
 gingetbitmap is a dummy implementation: it creates a new TIDBitmap and
 inserts all the tuples from the other TIDBitmap into it one by one, and then
 returns the new TIDBitmap.

The latest version of the patch no longer does this - instead, it
flushes the pending list to the main index if the bitmap becomes
lossy.  That strikes me as more tolerable than throwing an error, but
I agree with your criticism: I'm not sure why we are insisting on
using a TIDBitmap (which is designed to be lossy at times) in a
situation where we actually can't tolerate lossiness.  In fact, this
was the main point of my original review of this patch.

http://archives.postgresql.org/message-id/603c8f070902101859j91fb78eg7e0228afe8f2f...@mail.gmail.com

 If we remove the support for regular, non-bitmap, index scans with GIN, that
 could be cleaned up as well. Even if we don't do that, gingetbitmap should
 not error when the bitmap becomes lossy, but just return the lossy bitmap.

Makes sense to me.

...Robert



Re: [HACKERS] GIN, partial matches, lossy bitmaps

2009-03-02 Thread Robert Haas
On Mon, Mar 2, 2009 at 3:02 PM, Jeff Davis pg...@j-davis.com wrote:
 On Mon, 2009-03-02 at 21:14 +0200, Heikki Linnakangas wrote:
 If I'm reading the code correctly, item pointers of all matching heap
 tuples are first collected into a TIDBitmap, and then amgetnext returns
 tuples from that one by one. If the bitmap becomes lossy, an error is
 thrown. gingetbitmap is a dummy implementation: it creates a new
 TIDBitmap and inserts all the tuples from the other TIDBitmap into it
 one by one, and then returns the new TIDBitmap.

 Do you think that might be the cause of the extra startup overhead that
 Robert Haas observed for bitmap scans?

I don't think this is the same thing.  My point was that an index scan
wins big time over a bitmap index scan when the index scan doesn't
need to be run to completion - that is, when the query is a semi-join
or an anti-join, or when using LIMIT without ORDER BY.
This is true with or without Teodor's patch, and is the reason why I'm
not sure that removing index scan support is such a great idea.

...Robert



Re: [HACKERS] Hot standby, running xacts, subtransactions

2009-03-02 Thread Robert Treat
On Wednesday 25 February 2009 16:43:54 Simon Riggs wrote:
 On Wed, 2009-02-25 at 13:33 -0800, Josh Berkus wrote:
   You raised that as an annoyance previously because it means that
   connection in hot standby mode may be delayed in cases of heavy,
   repeated use of significant numbers of subtransactions.
 
  While most users still don't use explicit subtransactions at all,
  wouldn't this also affect users who use large numbers of stored
  procedures?

 If they regularly use more than 64 levels of nested EXCEPTION clauses
 *and* they start their base backups during heavy usage of those stored
 procedures, then yes.


We have stored procedures that loop over thousands of records, with 
begin...exception blocks in that loop, so I think we do that. AFAICT there's 
no way to tell if you have it wrong until you fire up the standby (i.e. you 
can't tell at the time you make your base backup), right?

-- 
Robert Treat
Conjecture: http://www.xzilla.net
Consulting: http://www.omniti.com



Re: [HACKERS] statistics horribly broken for row-wise comparison

2009-03-02 Thread Merlin Moncure
On Mon, Mar 2, 2009 at 8:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 It looks like for row-wise comparison, only the first column is used
 for generating the expected row count.

 [ shrug... ]  Short of multi-column statistics, it's hard to see how to
 do better.

hm... Why can't you just multiply the range estimates for the fields
together when doing an operation over the key?

For example, in this case if the planner estimates 10% of rows for
key, and 5% of matches for ts, just multiply .1 * .05 and get .005
when you fall into the row operation case.  This would give a
reasonably accurate answer...formally correct, even.  All the
information is there, or am I missing something (not knowing all the
inner workings of the planner, I certainly might be)?
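
The arithmetic proposed here, written out as a small sketch. The function
name combined_selectivity is hypothetical and this is not planner code.
It bakes in the assumption that the columns are statistically independent,
which is exactly the part that may not hold for columns sharing a
multicolumn index.

```c
/*
 * Combine per-column selectivity estimates by plain multiplication.
 * Only valid if the columns are independent (an assumption made here
 * for illustration, not something the planner can take for granted).
 */
double
combined_selectivity(const double *sel, int ncols)
{
    double      result = 1.0;
    int         i;

    for (i = 0; i < ncols; i++)
        result *= sel[i];
    return result;
}
```

With the figures from the example, {0.1, 0.05} yields roughly 0.005.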

IOW, I don't see this as a 'not enough statistics', more of a 'looking
at the statistics wrong for multi-column index range operation'
problem.  Equality works correctly, as it always has.  This is a kind
of a stats loophole introduced when we got the ability to correctly do
these types of operations in 8.2.

There's no workaround that I see to this problem short of disabling seq
scans (enable_seqscan = off).

The classic form of this query when looking for only one 'key' (my problem case):
select * from range where key = x and ts between a and b;

usually gives a plain index scan, which can be really undesirable.

merlin



Re: [HACKERS] statistics horribly broken for row-wise comparison

2009-03-02 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Mon, Mar 2, 2009 at 8:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 It looks like for row-wise comparison, only the first column is used
 for generating the expected row count.
 
 [ shrug... ]  Short of multi-column statistics, it's hard to see how to
 do better.

 hm... Why can't you just multiply the range estimates for the fields
 together when doing an operation over the key?

Because it would be the wrong answer, except in the uncommon case where
the field values are completely independent (at least, I would expect
that to be uncommon when people have multicolumn indexes on them).

In the case at hand, I think you're barking up the wrong tree anyway.
It's much more likely that what we need to do is fix
clauselist_selectivity to recognize range conditions involving
RowCompareExprs.

regards, tom lane



Re: [HACKERS] Immediate shutdown and system(3)

2009-03-02 Thread ITAGAKI Takahiro

Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:

 1. Implement a custom version of system(3) using fork+exec that lets us 
 trap SIGQUIT and send e.g. SIGTERM or SIGINT to the child instead. It 
 might be a bit tricky to get this right in a portable way; Windows would 
 certainly need a completely separate implementation.

I think the custom system() approach is the best plan for us because
it could open the door to faster recovery: if there were an asynchronous
version of system(), the startup process could restore archived WAL files
and redo the operations in them in parallel.
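
For illustration, a bare-bones synchronous version of such a custom
system() might look like the sketch below. Everything here is an
assumption rather than proposed PostgreSQL code: the names
system_forwarding_quit and forward_quit_as_term are hypothetical, it is
POSIX-only (Windows would need a separate implementation), and it ignores
the race where SIGQUIT arrives before the parent has recorded the child's
pid.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* pid of the running child, or 0; read by the signal handler */
static volatile sig_atomic_t child_pid = 0;

/* SIGQUIT handler: pass a gentler SIGTERM on to the child, if any. */
static void
forward_quit_as_term(int signo)
{
    (void) signo;
    if (child_pid > 0)
        kill((pid_t) child_pid, SIGTERM);
}

/*
 * Hypothetical system(3) replacement: run command via /bin/sh and, while
 * it runs, turn a SIGQUIT sent to us into a SIGTERM for the child.
 * Returns the wait status from waitpid(), or -1 on failure.
 */
int
system_forwarding_quit(const char *command)
{
    struct sigaction sa;
    struct sigaction old_sa;
    pid_t       pid;
    int         status = -1;

    sa.sa_handler = forward_quit_as_term;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGQUIT, &sa, &old_sa) != 0)
        return -1;

    pid = fork();
    if (pid == 0)
    {
        /* Child: exec resets caught signals to their defaults. */
        execl("/bin/sh", "sh", "-c", command, (char *) NULL);
        _exit(127);             /* exec failed */
    }
    if (pid > 0)
    {
        child_pid = pid;
        /* The handler interrupts waitpid(); retry until the child exits. */
        while (waitpid(pid, &status, 0) < 0)
        {
            if (errno != EINTR)
            {
                status = -1;
                break;
            }
        }
        child_pid = 0;
    }

    sigaction(SIGQUIT, &old_sa, NULL);  /* restore previous handling */
    return status;
}
```

An asynchronous variant would return right after fork() and leave the
waitpid() to the caller, which is what would allow restoring and replaying
WAL files to overlap.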

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] V4 of PITR performance improvement for 8.4

2009-03-02 Thread Fujii Masao
Hi Suzuki-san,

On Thu, Feb 26, 2009 at 5:03 AM, Koichi Suzuki koichi@gmail.com wrote:
 My reply to Gregory's comment didn't have any objections.   I believe,
 as I posted to Wiki page, latest posted patch is okay and waiting for
 review.

One of your latest patches no longer matches HEAD, so I updated it.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] V4 of PITR performance improvement for 8.4

2009-03-02 Thread Fujii Masao
On Tue, Mar 3, 2009 at 1:47 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Hi Suzuki-san,

 On Thu, Feb 26, 2009 at 5:03 AM, Koichi Suzuki koichi@gmail.com wrote:
 My reply to Gregory's comment didn't have any objections.   I believe,
 as I posted to Wiki page, latest posted patch is okay and waiting for
 review.

 One of your latest patches doesn't match with HEAD, so I updated it.

Oops! I failed to attach the patch. This is the second try.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


readahead-20090303.patch
Description: Binary data



[HACKERS] SIGHUP during recovery

2009-03-02 Thread Fujii Masao
Hi,

Currently, the startup process ignores SIGHUP.

The attached patch allows the startup process to re-read the config file:
when SIGHUP arrives, the startup process also receives the signal
from the postmaster and reloads the settings in the main redo apply loop.
Obviously, this is useful for changing the parameters that the startup
process may use (e.g. log_line_prefix, log_checkpoints).
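
The usual shape of such a change is a handler that only sets a flag, plus
a check of that flag at a safe point in the loop. The sketch below shows
that pattern in isolation; it is not the attached patch, and the names
startup_sighup_handler, install_sighup_handler and consume_sighup_request
are hypothetical.

```c
#include <signal.h>

/* Set by the handler, cleared by the main loop. */
static volatile sig_atomic_t got_sighup = 0;

/* Signal handler: do nothing but record that a reload was requested. */
static void
startup_sighup_handler(int signo)
{
    (void) signo;
    got_sighup = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int
install_sighup_handler(void)
{
    return (signal(SIGHUP, startup_sighup_handler) == SIG_ERR) ? -1 : 0;
}

/*
 * Meant to be called once per iteration of a (hypothetical) redo loop.
 * Returns 1 exactly once for each pending request, so the caller can
 * re-read its configuration at that point; returns 0 otherwise.
 */
int
consume_sighup_request(void)
{
    if (!got_sighup)
        return 0;
    got_sighup = 0;
    return 1;
}
```

Keeping the handler this small avoids calling anything that is not
async-signal-safe; the actual reload happens in the loop, between records.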

Any comments welcome.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


sighup_startup_0303.patch
Description: Binary data



[HACKERS] Why do we keep UnusedLock1 in LWLockId?

2009-03-02 Thread ITAGAKI Takahiro
Hi,

There is UnusedLock1 in the LWLockId enumeration in storage/lwlock.h:
|   UnusedLock1,    /* FreeSpaceMapLock used to be here */

At first I thought it was there to keep the LWLockId values the same as in
8.3, but we've already split SInvalLock into SInvalReadLock and
SInvalWriteLock, so the IDs have changed anyway.

Is there any reason to keep UnusedLock1 there?
Or can we simply remove it?
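
To make the numbering concern concrete, here is a tiny sketch with made-up
enums. The member names are hypothetical stand-ins, not the real LWLockId
list; the point is only that a placeholder keeps every later member's
value stable, and that removing it shifts them all down by one.

```c
/* Hypothetical enum that keeps a placeholder for a removed lock. */
typedef enum DemoLockWithPlaceholder
{
    WP_DemoFirstLock,
    WP_DemoUnusedLock1,     /* placeholder where a removed lock used to be */
    WP_DemoLastLock
} DemoLockWithPlaceholder;

/* The same enum with the placeholder dropped. */
typedef enum DemoLockWithoutPlaceholder
{
    WO_DemoFirstLock,
    WO_DemoLastLock
} DemoLockWithoutPlaceholder;

/* How far the members after the placeholder move when it is removed. */
int
placeholder_shift(void)
{
    return (int) WP_DemoLastLock - (int) WO_DemoLastLock;
}
```

If other changes have already renumbered the later members, the
placeholder no longer preserves anything.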

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center




[HACKERS] Updates of SE-PostgreSQL 8.4devel patches (r1668)

2009-03-02 Thread KaiGai Kohei

The series of SE-PostgreSQL patches for v8.4 were updated:

[1/5] http://sepgsql.googlecode.com/files/sepgsql-core-8.4devel-r1668.patch
[2/5] http://sepgsql.googlecode.com/files/sepgsql-utils-8.4devel-r1668.patch
[3/5] http://sepgsql.googlecode.com/files/sepgsql-policy-8.4devel-r1668.patch
[4/5] http://sepgsql.googlecode.com/files/sepgsql-docs-8.4devel-r1668.patch
[5/5] http://sepgsql.googlecode.com/files/sepgsql-tests-8.4devel-r1668.patch

- List of updates:
 * It is rebased to the latest CVS HEAD.
 * sepgsqlCheckProcedureInstall() is moved to sepgsql/hooks.c from
   sepgsql/perms.c, like the other sepgsqlCheck() functions deployed there.
 * sepgsqlCheckDatabaseAccess() is moved to pg_database_aclcheck()
   from pg_database_aclmask(), because pg_aclmask() can be invoked
   on ExecGrant_, but SE-PostgreSQL should not intervene in the existing
   DAC policy.
 * sepgsqlCheckProcedureExecute() is moved to pg_proc_aclcheck()
   for the same reason.

These changes are obvious and minor, and the rest of the implementation
is unchanged, so please don't assume this update requires reviewing
the whole patch set again.

I would like to know the current status of the patch review.
Partial results are welcome, even if the review is not yet 100% complete.
And please tell me if I missed anything.
Anyway, it is obvious we don't have enough time!

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com
