Re: [HACKERS] Largeobject Access Controls (r2460)
(2010/02/05 13:53), Takahiro Itagaki wrote:
> KaiGai Kohei wrote:
>>> default: both contents and metadata
>>> --data-only: same
>>> --schema-only: neither
>>
>> However, it means only large object performs an exceptional object class
>> that dumps its owner, acl and comment even if --data-only is given.
>> Is it really what you suggested, isn't it?
>
> I wonder we still need to have both "BLOB ITEM" and "BLOB DATA"
> even if we will take the all-or-nothing behavior. Can we handle
> BLOB's owner, acl, comment and data with one entry kind?

I looked at the corresponding code. Currently, we have three _LoadBlobs()
variations, in pg_backup_tar.c, pg_backup_files.c and pg_backup_custom.c.

In the _tar.c and _files.c cases, we can reasonably fetch the data contents
of the blob to be restored. All we need to do is provide an explicit
filename to the tarOpen() function; a blob does not need to be restored
sequentially. This means pg_restore can restore an arbitrary file when it
finds a new unified blob entry.

In the _custom.c case, _LoadBlobs() is called from _PrintTocData() when the
given TocEntry is "BLOBS", and it tries to load the multiple blobs that
follow. However, I could not find any restriction that prevents the custom
format from having multiple "BLOBS" sections. In other words, we can write
out one section containing a single blob for each new unified blob entry.

Right now, it seems feasible to implement what you suggested. The question
is whether we should do it or not. At least from a design perspective, it
seems better than the exceptional treatment of large objects in pg_dump
and pg_restore.

What is your opinion?

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Writeable CTEs patch
Takahiro Itagaki writes:
> * In the regression tests, almost all of them don't have an ORDER BY clause.
>   They just work, but we might need ORDER BY to get robust output.
>   What did we do in other regression tests?

We add ORDER BY only when experience shows it's necessary.  The reasoning
is explained in regress.sgml:

	You might wonder why we don't order all the regression test
	queries explicitly to get rid of this issue once and for all.  The
	reason is that that would make the regression tests less useful,
	not more, since they'd tend to exercise query plan types that
	produce ordered results to the exclusion of those that don't.

			regards, tom lane
Re: [HACKERS] Writeable CTEs patch
On 2010-02-05 07:14 UTC+2, Takahiro Itagaki wrote:
> Marko Tiikkaja wrote:
>> Here's an updated patch.  Only changes from the previous patch are
>> fixing the above issue and a regression test for it.
>
> * In the regression tests, almost all of them don't have an ORDER BY clause.
>   They just work, but we might need ORDER BY to get robust output.
>   What did we do in other regression tests?

Looking at with.sql, it seems to use ORDER BY when it accesses data from
a table.  But obviously we can't do this if we want to test
INSERT/UPDATE/DELETE .. RETURNING at the top level, and returning.sql
seems to be relying on the fact that the rows come out in the same order
every time.

Regards,
Marko Tiikkaja
Re: [HACKERS] Writeable CTEs patch
Marko Tiikkaja wrote:
> Here's an updated patch.  Only changes from the previous patch are
> fixing the above issue and a regression test for it.

A brief report on the patch:

* The patch has the following error cases, and also has one regression
  test for each case.
  - DML WITH is not allowed in a cursor declaration
  - DML WITH is not allowed in a view definition
  - DML WITH without RETURNING is only allowed inside an unreferenced CTE
  - DML WITH is only allowed at the top level
  - Recursive DML WITH statements are not supported
      ^-- might be better as "DML WITH cannot have a self-reference" or so?
  - Conditional DO INSTEAD rules are not supported in DML WITH statements
  - DO ALSO rules are not supported in DML WITH statements
  - Multi-statement DO INSTEAD rules are not supported in DML WITH statements
  - DO INSTEAD NOTHING rules are not supported in DML WITH statements

* In the regression tests, almost all of them don't have an ORDER BY clause.
  They just work, but we might need ORDER BY to get robust output.
  What did we do in other regression tests?

* I feel odd about the following paragraph in the docs, but it should be
  checked by native English speakers.

*** a/doc/src/sgml/ref/create_rule.sgml
--- b/doc/src/sgml/ref/create_rule.sgml
*** 222,227 **** CREATE [ OR REPLACE ] RULE name AS
--- 222,234 ----
+    In an INSERT, UPDATE or
+    DELETE query within a WITH clause,
+    only unconditional, single-statement INSTEAD rules are
       ^-- and? which comma is the sentence separator?
+    implemented.
       ^-- might be "available" rather than "implemented"?
+

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center
Re: Looking for Bill Huang; was [HACKERS] Personal Copyright Notices
Bruce Momjian wrote:
> Some more _personalized_ copyright notices have crept into our source
> tree:
>
> /src/tutorial/basics.source      Copyright (c) 1994, Andrew Yu, University of
>                                  California
> /contrib/intagg/Makefile         Copyright (c) 2001 Digital Music Network by
>                                  Mark L. Woodward
> /src/port/rint.c                 Copyright (c) 1999, repas AEG Automation GmbH
> /contrib/isn/isn.c               Copyright (c) 2004-2006, Germán Méndez Bravo
>                                  (Kronuz)
> /contrib/isn/isn.h               Copyright (c) 2004-2006, Germán Méndez Bravo
>                                  (Kronuz)
> /src/backend/utils/mb/Unicode/UCS_to_GB18030.pl   Copyright 2002 by Bill
>                                  Huang

I have now dealt with all of these except the last one:

> The UCS_to_GB18030.pl is a Perl script with this at the top:
>
> 	# Copyright 2002 by Bill Huang

Does anyone know how to contact Bill Huang?  Based on this posting:

	http://archives.postgresql.org/pgsql-announce/2002-06/msg3.php

he lives in Japan and worked for Red Hat, and a Japan phone number is
listed that I did not try.  I did try emailing him at
bill_huan...@ybb.ne.jp and hu...@redhat.com, but both emails returned
failure messages.

The perl file is 100 lines, so it is possible to rewrite it if necessary.
This same file was mentioned in 2007, so I think we should just find a
solution to this:

	http://archives.postgresql.org/pgsql-hackers/2007-03/msg01446.php

-- 
Bruce Momjian        http://momjian.us
EnterpriseDB         http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Largeobject Access Controls (r2460)
(2010/02/05 13:53), Takahiro Itagaki wrote:
> KaiGai Kohei wrote:
>>> default: both contents and metadata
>>> --data-only: same
>>> --schema-only: neither
>>
>> However, it means only large object performs an exceptional object class
>> that dumps its owner, acl and comment even if --data-only is given.
>> Is it really what you suggested, isn't it?
>
> I wonder we still need to have both "BLOB ITEM" and "BLOB DATA"
> even if we will take the all-or-nothing behavior. Can we handle
> BLOB's owner, acl, comment and data with one entry kind?

Is it possible to fetch a certain blob from a tar/custom archive when
pg_restore finds a TOC entry for that blob?

Currently, when pg_restore finds a "BLOB DATA" or "BLOBS" entry, it opens
the archive and restores all the blob objects sequentially.  It seems to
me we would also have to rework the custom format.

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei
Re: [HACKERS] Largeobject Access Controls (r2460)
KaiGai Kohei wrote:
>> default: both contents and metadata
>> --data-only: same
>> --schema-only: neither
>
> However, it means only large object performs an exceptional object class
> that dumps its owner, acl and comment even if --data-only is given.
> Is it really what you suggested, isn't it?

I wonder whether we still need to have both "BLOB ITEM" and "BLOB DATA"
even if we take the all-or-nothing behavior.  Can we handle a BLOB's
owner, acl, comment and data with one entry kind?

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center
Re: [HACKERS] Personal Copyright Notices
Bruce Momjian wrote:
> The ISN copyright looks like this:
>
>  * Copyright (c) 2004-2006, Germán Méndez Bravo (Kronuz)
>  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
>
> Interestingly, he mentions himself and PGDG both in the copyright, which
> makes this less of an issue.  Looking at the ISN code, it appears he
> was the creator of the original file:
>
> 	revision 1.1
> 	date: 2006/09/09 04:07:52;  author: tgl;  state: Exp;
> 	Add contrib/isn module for ISBN/ISSN/EAN13/etc product numbers, and
> 	remove the old isbn_issn module which is about to be obsoleted by EAN13.
> 	contrib/isn is by Germán Méndez Bravo.  Our thanks to Garrett A.
> 	Wollman for having written the original isbn_issn module.
>
> and he asked about his copyright mention as part of the patch
> discussion:
>
> 	http://archives.postgresql.org/pgsql-hackers/2006-09/msg00794.php
>
> I never saw the issue actually addressed, and the copyright line was
> committed.  Do people think it is acceptable?

I received email permission from Germán Méndez Bravo to just mention him
as the /contrib/isn author, and remove the "copyright" word.  Change
committed.

-- 
Bruce Momjian        http://momjian.us
EnterpriseDB         http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] remove contrib/xml2
I did some tests following Robert's test cases on both postgresql
8.4.2-0ubu and 8.3.8-1, OS: Ubuntu Karmic.

1) 1st test case: it doesn't crash on 8.3.8, but crashes on 8.4.2;
2) 2nd test case: both 8.3.8 and 8.4.2 are fine, and no warning
   (different from Robert's test?);
3) 3rd test case (and a modified test case for 8.3.8): both 8.3.8 and
   8.4.2 are not correct, same as Robert's test (8.5 beta?);

* 1st test case:

== 8.3.8 ==
conifer=# CREATE TABLE xpath_test (id integer NOT NULL, t xml);
CREATE TABLE
conifer=# INSERT INTO xpath_test VALUES (1, '<doc><int>1</int></doc>');
INSERT 0 1
conifer=# SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true')
conifer-# as t(id int4);
 id
----
  1
(1 row)

== 8.4.2 ==
conifer=# CREATE TABLE xpath_test (id integer NOT NULL, t xml);
CREATE TABLE
conifer=# INSERT INTO xpath_test VALUES (1, '<doc><int>1</int></doc>');
INSERT 0 1
conifer=# SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true')
conifer-# as t(id int4);
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

* 2nd test case:

== 8.3.8 and 8.4.2 ==
conifer=# CREATE TABLE xpath_test (id integer NOT NULL, t text);
CREATE TABLE
conifer=# INSERT INTO xpath_test VALUES (1, '<doc><int>1</int></doc>');
INSERT 0 1
conifer=# SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true')
conifer-# as t(id int4);
 id
----
  1
(1 row)

* 3rd test case:

== 8.3.8 and 8.4.2 ==
conifer=# CREATE TABLE xpath_test (id integer NOT NULL, t text);
CREATE TABLE
conifer=# INSERT INTO xpath_test VALUES (1, '<rowlist><row a="1" b="oops"/><row a="2"/></rowlist>');
INSERT 0 1
conifer=# SELECT * FROM xpath_table('id', 't', 'xpath_test',
conifer-# '/rowlist/row/@a|/rowlist/row/@b', 'true') as t(id int4, a text, b text);
 id | a |  b
----+---+------
  1 | 1 | oops
  1 | 2 |
(2 rows)

== 8.3.8 (modified 3rd test case, because 8.3.8 won't crash using xml) ==
conifer=# CREATE TABLE xpath_test (id integer NOT NULL, t xml);
CREATE TABLE
conifer=# INSERT INTO xpath_test VALUES (1, '<rowlist><row a="1" b="oops"/><row a="2"/></rowlist>');
INSERT 0 1
conifer=# SELECT * FROM xpath_table('id', 't', 'xpath_test',
conifer-# '/rowlist/row/@a|/rowlist/row/@b', 'true') as t(id int4, a text, b text);
 id | a |  b
----+---+------
  1 | 1 | oops
  1 | 2 |
(2 rows)

* For the 1st test case, I'm not sure whether some patches applied to 8.3
haven't been applied to 8.4, or whether other reasons cause the difference
between 8.3.8 and 8.4.2.  Any ideas or comments?

(Note: the XML literals above were stripped by the list archive; they are
reconstructed here from Robert's original test cases.)

Thank you,
M Z

On Mon, Feb 1, 2010 at 8:44 PM, Robert Haas wrote:
> On Mon, Feb 1, 2010 at 5:23 PM, Andrew Dunstan wrote:
>> Robert Haas wrote:
>>> (2) add a very, very large warning that this will crash if you do
>>> almost anything with it.
>>
>> I think that's an exaggeration. Certain people are known to be using it
>> quite successfully.
>
> Hmm.  Well, all I know is that the first thing I tried crashed the server.
>
> CREATE TABLE xpath_test (id integer NOT NULL, t xml);
> INSERT INTO xpath_test VALUES (1, '<doc><int>1</int></doc>');
> SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true')
> as t(id int4);
>
> It doesn't crash if you change the type of t from xml to text; instead
> you get a warning about some sort of memory allocation problem.
>
> DROP TABLE xpath_test;
> CREATE TABLE xpath_test (id integer NOT NULL, t text);
> INSERT INTO xpath_test VALUES (1, '<doc><int>1</int></doc>');
> SELECT * FROM xpath_table('id', 't', 'xpath_test', '/doc/int', 'true')
> as t(id int4);
>
> yields:
>
> WARNING:  problem in alloc set ExprContext: bogus aset link in block
> 0x14645e0, chunk 0x14648b8
>
> And then there's this (see also bug #5285):
>
> DELETE FROM xpath_test;
> INSERT INTO xpath_test VALUES (1, '<rowlist><row a="1"
> b="oops"/><row a="2"/></rowlist>');
> SELECT * FROM xpath_table('id', 't', 'xpath_test',
> '/rowlist/row/@a|/rowlist/row/@b', 'true') as t(id int4, a text, b
> text);
>
> which yields an answer that is, at least, extremely surprising, if not
> flat-out wrong:
>
>  id | a |  b
> ----+---+------
>   1 | 1 | oops
>   1 | 2 |
> (2 rows)
>
> Bugs #4953 and #5079 can also be reproduced in CVS HEAD.  Both crash the
> server.
>
> ...Robert
Re: [HACKERS] Personal Copyright Notices
Bruce Momjian wrote:
> The intagg copyright is on a _Makefile_:
>
> 	# Makefile for integer aggregator
> 	# Copyright (C) 2001 Digital Music Network.
> 	# by Mark L. Woodward
> 	# $PostgreSQL: pgsql/contrib/intagg/Makefile,v 1.10 2008/11/14 19:58:45 tgl Exp $
>
> Seems we either have to contact the author or rewrite the file.

I have rewritten the above file from scratch to enable removal of the
copyright mention.

> ---
>
> The rint.c file is again tiny:
>
> 	 * rint.c
> 	 *	  rint() implementation
> 	 *
> 	 * Copyright (c) 1999, repas AEG Automation GmbH
>
> Perhaps that just needs to be rewritten.

I received a new rint() version from Nathan Wagner that is simpler, so I
replaced the old rint() code with Nathan's and removed the copyright
mention.  Patch attached.

-- 
Bruce Momjian        http://momjian.us
EnterpriseDB         http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Index: src/port/rint.c
===================================================================
RCS file: /cvsroot/pgsql/src/port/rint.c,v
retrieving revision 1.2
diff -c -c -r1.2 rint.c
*** src/port/rint.c	29 Nov 2003 19:52:13 -0000	1.2
--- src/port/rint.c	5 Feb 2010 03:20:25 -0000
***************
*** 3,11 ****
   * rint.c
   *	  rint() implementation
   *
-  * Copyright (c) 1999, repas AEG Automation GmbH
-  *
-  *
   * IDENTIFICATION
   *	  $PostgreSQL: pgsql/src/port/rint.c,v 1.2 2003/11/29 19:52:13 pgsql Exp $
   *
--- 3,8 ----
***************
*** 18,37 ****
  double
  rint(double x)
  {
! 	double		f,
! 				n = 0.;
!
! 	f = modf(x, &n);
!
! 	if (x > 0.)
! 	{
! 		if (f > .5)
! 			n += 1.;
! 	}
! 	else if (x < 0.)
! 	{
! 		if (f < -.5)
! 			n -= 1.;
! 	}
! 	return n;
  }
--- 15,19 ----
  double
  rint(double x)
  {
! 	return (x > 0.0) ? floor(x + 0.5) : ceil(x - 0.5);
  }
Re: [HACKERS] testing cvs HEAD - HS/SR - cannot stat
On Fri, Feb 5, 2010 at 3:58 AM, Heikki Linnakangas wrote:
>> LOG:  database system was interrupted while in recovery at log time
>> 2010-02-04 20:45:40 EET
>> HINT:  If this has occurred more than once some data might be corrupted and
>> you might need to choose an earlier recovery target.
>
> Can we avoid printing this? It's a bit alarming to talk about corrupted
> data.

I think this is useful information as a hint of data corruption.  But it's
odd for this to be reported even when restarting a standby server that was
previously shut down normally (i.e., in fast shutdown mode).  How about
adding a new system status like DB_SHUTDOWNED_IN_RECOVERY, setting the
status to it when the shutdown is performed during recovery, and reporting
a suitable message when starting up the server from it?

>> LOG:  starting archive recovery

This is reported even if restore_command is not given and the WAL files
are never restored from the archive.  Should we get rid of it in that case?

>> LOG:  restore_command = 'cp /home/hlinnaka/pgsql.cvshead/walarchive/%f %p'
>> LOG:  standby_mode = 'true'
>> LOG:  primary_conninfo = 'host=localhost port=5432 user=rep_user
>> password=reppass'
>> LOG:  trigger_file = '/tmp/standby-trigger'
>
> Do we really need to echo all the lines in recovery.conf? That might be
> interesting information, but perhaps it could be condensed and worded
> more nicely.

It's OK for me to move them from LOG to DEBUG.

>> cp: cannot stat
>> `/home/hlinnaka/pgsql.cvshead/walarchive/00010007': No such
>> file or directory
>
> This is the noise line Josh started this thread with.

Agreed.  But the messages other than ENOENT that restore_command emits
might be useful.

>> LOG:  automatic recovery in progress
>
> Ok, so what?

Seems unnecessary.

>> LOG:  initializing recovery connections
>
> Seems like unnecessary noise, doesn't mean anything to a DBA.

Agreed.

>> LOG:  redo starts at 0/700140C
>
> I guess this could be useful debug information sometimes.

Agreed.

>> LOG:  consistent recovery state reached at 0/700142C
>
> It's nice to know that it has reached consistency, but was there any way
> to know before this line that it hadn't been reached yet? Perhaps the
> "redo starts" line should mention that consistency hasn't been reached yet.

But redo might restart from a consistent database if the standby server
was shut down after it reached the consistent state.

>> LOG:  database system is ready to accept read only connections
>
> This is an important piece of information with Hot Standby.

Agreed.

> Now, should we print a line when we connect to the master successfully?
> Seems like useful information.

Agreed.

> Then there's the LOG lines whenever a file is restored successfully from
> archive; are they necessary anymore, now that you can connect to the
> standby and use pg_last_xlog_replay_location() to poll its status?

How about moving those messages from LOG to DEBUG?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] building tsquery directly in memory (avoid makepol)
On Thu, 04 Feb 2010 22:13:02 +0300 Teodor Sigaev wrote:
>> Before doing it the trial and error way, can somebody just make
>> me an example?
>> I'm not pretty sure about my interpretation of the comments of
>> the documentation.
>>
>> TSQuery
> [skipped]
> Right, valcrc is computed in pushValue

Anyway, the structure I posted is correct, isn't it?

Are there any macros equivalent to POSTDATALEN and WEP_GETWEIGHT, and a
macro to compute the memory size of a TSQuery?  I think I've seen macros
that could help me determine the size of a TSQuery... but I haven't
noticed anything like POSTDATALEN that could come in very handy to
traverse a TSQuery.

I was thinking to skip pushValue and directly build the TSQuery in
memory, since my queries have a very simple structure and they are easy
to reduce... Still, it is not immediate to know the memory size in
advance.  For OR queries it is easy, but for AND queries I'll have to
loop over a tsvector, filter the weights according to a passed parameter
and see how many times I have to duplicate a lexeme for each weight, e.g.

	tsvector_to_tsquery('pizza:1A,2B risotto:2C,4D barolo:5A,6C', '&', 'ACD');

should be turned into

	pizza:A & risotto:C & risotto:D & barolo:A & barolo:C

I noticed you actually loop over the tsvector in tsvectorout to allocate
the memory for the string buffer, and I was wondering if it is really
worth it for my case as well.

Any good recipe in Moscow? ;)

thanks

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it
Re: [HACKERS] Re: [BUGS] BUG #4566: pg_stop_backup() reports incorrect STOP WAL LOCATION
On Fri, Feb 5, 2010 at 9:08 AM, Takahiro Itagaki wrote:
>> But it was rejected because its change might break the existing app.
>
> It might break existing applications if it returns "FE" instead of "FF",
> but a never-used filename surprises users. (IMO, the existing apps
> probably crash if "FF" is returned, i.e., 1/256 of the time.)
>
> Should it return the *next* reasonable log filename instead of "FF"?
> For example, 000000020000001100000000 for the above case.

I wonder if that change would also break existing apps.  But since I've
never seen an app that doesn't take that filename at face value, I agree
with changing the existing (odd, to me) behavior of pg_stop_backup().

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Re: [BUGS] BUG #4566: pg_stop_backup() reports incorrect STOP WAL LOCATION
Fujii Masao wrote:
> On Fri, Dec 5, 2008 at 11:41 PM, Randy Isbell wrote:
>> An inconsistency exists between the segment name reported by
>> pg_stop_backup() and the actual WAL file name.
>>
>> START WAL LOCATION: 10/FE1E2BAC (file 0000000200000010000000FE)
>> STOP WAL LOCATION: 10/FF000000 (file 0000000200000010000000FF)
>
> But it was rejected because its change might break the existing app.

It might break existing applications if it returns "FE" instead of "FF",
but a never-used filename surprises users. (IMO, the existing apps
probably crash if "FF" is returned, i.e., 1/256 of the time.)

Should it return the *next* reasonable log filename instead of "FF"?
For example, 000000020000001100000000 for the above case.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center
[HACKERS] pg_class has no toast table?
Still fooling with VACUUM FULL on catalogs ... I find that a sanity check
I put in is barfing on "VACUUM FULL pg_class", because the transient
table is built with a toast table, whereas pg_class hasn't got one.

It seems like it probably ought to have one, because either relacl or
reloptions could in principle be too big to fit without toasting (which
is exactly why AlterTableCreateToastTable thinks it should make one for
the transient table).

I have a vague feeling that we intentionally omitted a toast table for
pg_class, but I don't remember why.  Comments?

			regards, tom lane
Re: [HACKERS] Largeobject Access Controls (r2460)
(2010/02/05 3:27), Alvaro Herrera wrote:
> Robert Haas escribió:
>> 2010/2/4 KaiGai Kohei:
>>> (2010/02/04 0:20), Robert Haas wrote:
>>>> 2010/2/1 KaiGai Kohei:
>>>>> I again wonder whether we are on the right direction.
>>>>
>>>> I believe the proposed approach is to dump blob metadata if and only
>>>> if you are also dumping blob contents, and to do all of this for data
>>>> dumps but not schema dumps.  That seems about right to me.
>>>
>>> In other words:
>>>
>>>   -> blob contents and metadata (owner, acl, comments) shall
>>>      be dumped
>>>   --data-only -> only blob contents shall be dumped
>>>   --schema-only -> neither blob contents and metadata are dumped.
>>>
>>> Can I understand correctly?
>>
>> No, that's not what I said.  Please reread.  I don't think you should
>> ever dump blob contents without the metadata, or the other way around.
>> So:
>>
>> default:       both contents and metadata
>> --data-only:   same
>> --schema-only: neither
>
> Seems reasonable.

OK... I'll try to update the patch, anyway.

However, it means only large object performs an exceptional object class
that dumps its owner, acl and comment even if --data-only is given.
Is it really what you suggested, isn't it?

Thanks,
-- 
KaiGai Kohei
Re: [HACKERS] building tsquery directly in memory (avoid makepol)
> Before doing it the trial and error way can somebody just make
> me an example?
> I'm not pretty sure about my interpretation of the comments of
> the documentation.
>
> TSQuery
[skipped]

Right, valcrc is computed in pushValue.

> I couldn't see any place in the code where TSQuery is built in
> "one shot" in spite of using pushValue.

That's because in all those places we parse a rather complex structure.
A simple OR-ed query could be hardcoded as

	pushValue('X');
	pushValue('YY');
	pushOperator(OP_OR);
	pushValue('ZZZ');
	pushOperator(OP_OR);

You need to call pushValue/pushOperator in reverse polish notation
order.  Note, you can use another order:

	pushValue('X');
	pushValue('YY');
	pushValue('ZZZ');
	pushOperator(OP_OR);
	pushOperator(OP_OR);

So, the first example will produce ( X | YY ) | ZZZ, and the second one
X | ( YY | ZZZ ).

> Another thing I'd like to know is:
> what is going to be preferred during a scan between
> 'java:1A,2B '::tsvector @@ to_tsquery('java:A | java:B');
> vs.
> 'java:1A,2B '::tsvector @@ to_tsquery('java:AB')
> ? they look equivalent. Are they?

Yes, but the second one should be more efficient.

-- 
Teodor Sigaev                      E-mail: teo...@sigaev.ru
                                   WWW: http://www.sigaev.ru/
Re: [HACKERS] testing cvs HEAD - HS/SR - cannot stat
Magnus Hagander wrote:
> 2010/2/4 Heikki Linnakangas:
>> Josh Berkus wrote:
>>> Can we improve the error message? Right now it's alarming people. Such as:
>>>
>>> cannot stat
>>> `/var/data1/pg_stuff/dump/replication_archive/00010002':
>>> End of Log
>>
>> Not really, it's coming from 'cp'. Not sure if we could capture the
>> stderr and somehow decorate the message to make it less alarming.
>>
>> This isn't really a new problem, PITR has always done that, but I agree
>> it's always been a bit clumsy to have those messages in the log.
>
> Can we follow it up with a well-phrased message that we switched to
> streaming mode right after it, so people realize it's not a problem?

Yeah, that would be a good idea.

We should give the messages we emit some holistic thinking.  There are
quite a few new messages printed thanks to both Hot Standby and Streaming
Replication, and it seems we still need a few more.  Which ones are
really required, and which ones are just noise?  Here's what I see in a
test standby server:

> LOG:  database system was interrupted while in recovery at log time
> 2010-02-04 20:45:40 EET
> HINT:  If this has occurred more than once some data might be corrupted and
> you might need to choose an earlier recovery target.

Can we avoid printing this? It's a bit alarming to talk about corrupted
data.

> LOG:  starting archive recovery
> LOG:  restore_command = 'cp /home/hlinnaka/pgsql.cvshead/walarchive/%f %p'
> LOG:  standby_mode = 'true'
> LOG:  primary_conninfo = 'host=localhost port=5432 user=rep_user
> password=reppass'
> LOG:  trigger_file = '/tmp/standby-trigger'

Do we really need to echo all the lines in recovery.conf? That might be
interesting information, but perhaps it could be condensed and worded
more nicely.

> cp: cannot stat
> `/home/hlinnaka/pgsql.cvshead/walarchive/00010007': No such
> file or directory

This is the noise line Josh started this thread with.

> LOG:  automatic recovery in progress

Ok, so what?

> LOG:  initializing recovery connections

Seems like unnecessary noise, doesn't mean anything to a DBA.

> LOG:  redo starts at 0/700140C

I guess this could be useful debug information sometimes.

> LOG:  consistent recovery state reached at 0/700142C

It's nice to know that it has reached consistency, but was there any way
to know before this line that it hadn't been reached yet? Perhaps the
"redo starts" line should mention that consistency hasn't been reached yet.

> LOG:  database system is ready to accept read only connections

This is an important piece of information with Hot Standby.

Now, should we print a line when we connect to the master successfully?
Seems like useful information.

Then there's the LOG lines whenever a file is restored successfully from
archive; are they necessary anymore, now that you can connect to the
standby and use pg_last_xlog_replay_location() to poll its status?

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] testing cvs HEAD - HS/SR - cannot stat
On Thu, February 4, 2010 19:29, Heikki Linnakangas wrote:
> Josh Berkus wrote:
>> Can we improve the error message? Right now it's alarming people. Such as:
>>
>> cannot stat
>> `/var/data1/pg_stuff/dump/replication_archive/00010002':
>> End of Log
>
> Not really, it's coming from 'cp'. Not sure if we could capture the
> stderr and somehow decorate the message to make it less alarming.
>
> This isn't really a new problem, PITR has always done that, but I agree
> it's always been a bit clumsy to have those messages in the log.

It was (eventually) clear to me that it was not really a problem, but
only because I was testing anyway :)

I think most people would do what I did: search the documentation for
the string 'No such file or directory' or 'cannot stat'.  (But that
produces no mention of this (non-)problem.)

ISTM mentioning it in the documentation would be good enough.

Erik Rijkers
Re: [HACKERS] testing cvs HEAD - HS/SR - cannot stat
2010/2/4 Heikki Linnakangas:
> Josh Berkus wrote:
>> Can we improve the error message? Right now it's alarming people. Such as:
>>
>> cannot stat
>> `/var/data1/pg_stuff/dump/replication_archive/00010002':
>> End of Log
>
> Not really, it's coming from 'cp'. Not sure if we could capture the
> stderr and somehow decorate the message to make it less alarming.
>
> This isn't really a new problem, PITR has always done that, but I agree
> it's always been a bit clumsy to have those messages in the log.

Can we follow it up with a well-phrased message that we switched to
streaming mode right after it, so people realize it's not a problem?

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Largeobject Access Controls (r2460)
Robert Haas escribió:
> 2010/2/4 KaiGai Kohei:
>> (2010/02/04 0:20), Robert Haas wrote:
>>> 2010/2/1 KaiGai Kohei:
>>>> I again wonder whether we are on the right direction.
>>>
>>> I believe the proposed approach is to dump blob metadata if and only
>>> if you are also dumping blob contents, and to do all of this for data
>>> dumps but not schema dumps.  That seems about right to me.
>>
>> In other words:
>>
>>   -> blob contents and metadata (owner, acl, comments) shall
>>      be dumped
>>   --data-only -> only blob contents shall be dumped
>>   --schema-only -> neither blob contents and metadata are dumped.
>>
>> Can I understand correctly?
>
> No, that's not what I said.  Please reread.  I don't think you should
> ever dump blob contents without the metadata, or the other way around.

So:

default:       both contents and metadata
--data-only:   same
--schema-only: neither

Seems reasonable.

-- 
Alvaro Herrera                          http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] testing cvs HEAD - HS/SR - cannot stat
Josh Berkus wrote: > Can we improve the error message? Right now it's alarming people. Such as: > > cannot stat > `/var/data1/pg_stuff/dump/replication_archive/00010002': > End of Log Not really, it's coming from 'cp'. Not sure if we could capture the stderr and somehow decorate the message to make it less alarming. This isn't really a new problem, PITR has always done that, but I agree it's always been a bit clumsy to have those messages in the log. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] building tsquery directly in memory (avoid makepol)
I know in advance the structure of a whole tsquery: it has already been reduced and the lexemes have already been computed. I'd like to write it directly in memory without having to pass through pushValue/makepol. However, I'm not quite sure about the layout of a tsquery in memory, and I still haven't been able to find the macros that could help me [1]. Before doing it the trial-and-error way, can somebody give me an example? I'm not quite sure about my interpretation of the comments in the documentation. This is how I'd write X:AB | YY:C | ZZZ:D:

TSQuery
  vl_len_   (total # of bytes of the whole following structure:
             QueryItems*size + total lexeme length)
  size      (# of QueryItems in the query)
  QueryItem        type QI_OPR  oper OP_OR  left -> distance from QueryItem X:AB
  QueryItem        type QI_OPR  oper OP_OR  left -> distance from QueryItem ZZZ:D
  QueryItem (X)    type QI_VAL  weight 1100  valcrc ???  length 1  distance
  QueryItem (YY)   type QI_VAL  weight 0010  valcrc ???  length 2  distance
  QueryItem (ZZZ)  type QI_VAL  weight 0001  valcrc ???  length 3  distance
  X YY ZZZ

[1] the equivalent of POSTDATALEN, WEP_GETWEIGHT, macros to compute the size of various parts of TSQuery, etc... I couldn't see any place in the code where a TSQuery is built in "one shot" instead of going through pushValue. Another thing I'd like to know is: which is going to be preferred during a scan, 'java:1A,2B '::tsvector @@ to_tsquery('java:A | java:B'); vs. 'java:1A,2B '::tsvector @@ to_tsquery('java:AB') ? They look equivalent. Are they? thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] testing cvs HEAD - HS/SR - cannot stat
> Yeah, this is not a bug. > > At first, the standby performs an archive recovery until an invalid > WAL record is found. Then it starts replication and tries to receive > the missing WAL records from the primary. So such an error message > would be logged whenever an invalid record is found and replication > is started. Can we improve the error message? Right now it's alarming people. Such as: cannot stat `/var/data1/pg_stuff/dump/replication_archive/00010002': End of Log Of course, it would be even better to suppress this error message entirely unless the user is at DEBUG1 or better. --Josh Berkus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Assorted cleanups in preparation for using a map file to support
On Thu, 2010-02-04 at 13:02 -0500, Tom Lane wrote: > Simon Riggs writes: > > On Wed, 2010-02-03 at 10:48 -0500, Tom Lane wrote: > >>> If so, there is some minor code cleanup and comment changes in > >>> ProcessCommittedInvalidationMessages(). Would you like me to do that, or > >>> should we wait? > >> > >> I saw that. I didn't touch it because it's not directly relevant to > >> what I'm doing right now, but I would like to go back and see whether > >> that routine can't be got rid of completely. It seems to me to be a > >> very klugy substitute for having enough information. I'm inclined to > >> think that we should emit an sinval message (or maybe better a separate > >> WAL entry) for initfile removal, instead of trying to reverse-engineer > >> whether one happened. > > > An additional sinval message type would work. There is a requirement for > > us to run RelationCacheInitFileInvalidate() both before and after the > > other messages. So we would need to append and prepend the new message > > type onto the array of messages if transInvalInfo->RelcacheInitFileInval > > is true. That way we would just do SendSharedInvalidMessages() in > > xact_redo_commit and remove ProcessCommittedInvalidationMessages(), > > adding other code to handle the inval message. Doesn't seem any easier > > though. > > > Another WAL record would definitely be cumbersome. > > BTW, we're definitely going to have to do *something* with that code, > because it's assuming that non-shared relcache init files always live in > DEFAULTTABLESPACE. That is not correct. Oh dear. > I think that there is no > simple way for the startup process to identify which tablespace a given > database lives in (normally, one would have to consult pg_database to > find that out). So we are going to have to drive this off an sinval or > WAL record that does provide the tablespace as well as the DB OID. Seems OK to just add the tablespace to the sinval. 
-- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Assorted cleanups in preparation for using a map file to support
Simon Riggs writes: > On Wed, 2010-02-03 at 10:48 -0500, Tom Lane wrote: >>> If so, there is some minor code cleanup and comment changes in >>> ProcessCommittedInvalidationMessages(). Would you like me to do that, or >>> should we wait? >> >> I saw that. I didn't touch it because it's not directly relevant to >> what I'm doing right now, but I would like to go back and see whether >> that routine can't be got rid of completely. It seems to me to be a >> very klugy substitute for having enough information. I'm inclined to >> think that we should emit an sinval message (or maybe better a separate >> WAL entry) for initfile removal, instead of trying to reverse-engineer >> whether one happened. > An additional sinval message type would work. There is a requirement for > us to run RelationCacheInitFileInvalidate() both before and after the > other messages. So we would need to append and prepend the new message > type onto the array of messages if transInvalInfo->RelcacheInitFileInval > is true. That way we would just do SendSharedInvalidMessages() in > xact_redo_commit and remove ProcessCommittedInvalidationMessages(), > adding other code to handle the inval message. Doesn't seem any easier > though. > Another WAL record would definitely be cumbersome. BTW, we're definitely going to have to do *something* with that code, because it's assuming that non-shared relcache init files always live in DEFAULTTABLESPACE. That is not correct. I think that there is no simple way for the startup process to identify which tablespace a given database lives in (normally, one would have to consult pg_database to find that out). So we are going to have to drive this off an sinval or WAL record that does provide the tablespace as well as the DB OID. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Largeobject Access Controls (r2460)
2010/2/4 KaiGai Kohei : > (2010/02/04 0:20), Robert Haas wrote: >> 2010/2/1 KaiGai Kohei: >>> I again wonder whether we are on the right direction. >> >> I believe the proposed approach is to dump blob metadata if and only >> if you are also dumping blob contents, and to do all of this for data >> dumps but not schema dumps. That seems about right to me. > > In other words: > > default -> blob contents and metadata (owner, acl, comments) shall > be dumped > --data-only -> only blob contents shall be dumped > --schema-only -> neither blob contents nor metadata are dumped. > > Can I understand correctly? No, that's not what I said. Please reread. I don't think you should ever dump blob contents without the metadata, or the other way around. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Shared catalogs vs pg_global tablespace
On Thu, Feb 4, 2010 at 10:30 AM, Heikki Linnakangas wrote: > Tom Lane wrote: >> I'm not sure whether allowing that would be good or bad. I see no >> obvious killer reason why it'd be bad, but it seems like the kind of >> thing we might regret someday. pg_global is in some sense an >> implementation artifact, so allowing users to depend on it might be >> bad in the long run. > > Agreed, it feels scary to allow it. +1. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG 9.0 and standard_conforming_strings
Robert Haas wrote: On Wed, Feb 3, 2010 at 5:57 PM, Andrew Dunstan wrote: marcin mank wrote: A certain prominent web framework has a nasty SQL injection bug when PG is configured with SCS. This bug is not present without SCS (details per email for interested PG hackers). I say, hold it off. Any web framework that interpolates user supplied values into SQL rather than using placeholders is broken from the get go, IMNSHO. I'm not saying that there aren't reasons to hold up moving to SCS, but this isn't one of them. That seems more than slightly harsh. I've certainly come across situations where interpolating values (with proper quoting of course) made more sense than using placeholders. YMMV, of course. How many injection attacks should we witness before deciding that the best defence is to get out of the quoting/escaping game? Personally I have reached that threshold. Remember that this is a web *framework*, something that would ideally be using best practice and heightened security awareness. There could be cases where some applications with well known structures and queries interpolate carefully sanitised values into SQL, but I very much doubt that web app frameworks should be indulging in such practices. They should go the extra mile, IMNSHO. Anyway, I think this conversation is going slightly astray. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG 9.0 and standard_conforming_strings
On Feb 3, 2010, at 6:16 PM, Robert Haas wrote: >> Any web framework that interpolates user supplied values into SQL rather >> than using placeholders is broken from the get go, IMNSHO. I'm not saying >> that there aren't reasons to hold up moving to SCS, but this isn't one of >> them. > > That seems more than slightly harsh. I've certainly come across > situations where interpolating values (with proper quoting of course) > made more sense than using placeholders. YMMV, of course. Not if it leads to Little Bobby Tables's door when, you know, you use SQL conformant strings! Sounds like an app that needs its quoting function fixed. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: NaN/Inf fix for ECPG Re: [HACKERS] out-of-scope cursor errors
On Thu, Feb 04, 2010 at 03:55:06PM +0100, Boszormenyi Zoltan wrote: > I added the #include to the nan_test.pgc in the hope > it fixes the Windows machines in the buildfarm. I already committed this earlier today after looking at the problem myself. But of course I'm also just hoping, as I do not have a Windows build system either. So could you please re-diff? > The patch also contains cleanups in the outofscope and sqlda > regression tests so they do > #include <pgtypes_numeric.h> > instead of > exec sql include pgtypes_numeric.h; Is there a reason for this? Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org ICQ 179140304, AIM/Yahoo/Skype michaelmeskes, Jabber mes...@jabber.org VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Writeable CTEs documentation patch
Hi, Attached is a documentation patch for writeable CTEs. Most of it is explaining how this feature works in select.sgml. I wasn't sure if that's the right place, but couldn't find a better one. I also wasn't able to find any place discussing the command tag, other than libpq's documentation. Is there one somewhere? While working on the docs, I noticed one problem with the patch itself: it doesn't handle multi-statement DO INSTEAD rules correctly. I'm going to submit a fix for that later. Any suggestions whatsoever are welcome. Regards, Marko Tiikkaja *** a/doc/src/sgml/queries.sgml --- b/doc/src/sgml/queries.sgml *** *** 1530,1538 SELECT select_list FROM table_expression WITH provides a way to write subqueries for use in a larger !SELECT query. The subqueries can be thought of as defining !temporary tables that exist just for this query. One use of this feature !is to break down complicated queries into simpler parts. An example is: WITH regional_sales AS ( --- 1530,1538 WITH provides a way to write subqueries for use in a larger !query. The subqueries can be thought of as defining temporary tables !that exist just for this query. One use of this feature is to break down !complicated queries into simpler parts. An example is: WITH regional_sales AS ( *** *** 1560,1565 GROUP BY region, product; --- 1560,1586 +WITH clauses are not restricted to only SELECT +queries; you can also use INSERT, UPDATE or +DELETE. This allows you to perform many different operations +in the same query. An example of this is: + + + WITH rows AS ( + DELETE FROM ONLY products + WHERE + "date" >= '2009-10-01' AND + "date" < '2009-11-01' + RETURNING * + ) + INSERT INTO products_log + SELECT * FROM rows; + + +which moves rows from products to products_log. + + + The optional RECURSIVE modifier changes WITH from a mere syntactic convenience into a feature that accomplishes things not otherwise possible in standard SQL. 
Using *** a/doc/src/sgml/ref/create_rule.sgml --- b/doc/src/sgml/ref/create_rule.sgml *** *** 222,227 CREATE [ OR REPLACE ] RULE name AS --- 222,234 +In an INSERT, UPDATE or +DELETE query within a WITH clause, +only unconditional, single-statement INSTEAD rules are +implemented. + + + It is very important to take care to avoid circular rules. For example, though each of the following two rule definitions are accepted by PostgreSQL, the *** a/doc/src/sgml/ref/delete.sgml --- b/doc/src/sgml/ref/delete.sgml *** *** 21,30 PostgreSQL documentation --- 21,36 + [ WITH [ RECURSIVE ] with_query [, ...] ] DELETE FROM [ ONLY ] table [ [ AS ] alias ] [ USING using_list ] [ WHERE condition | WHERE CURRENT OF cursor_name ] [ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ] + + where with_query is: + + with_query_name [ ( column_name [, ...] ) ] AS ( select | insert | update | delete ) + *** *** 84,89 DELETE FROM [ ONLY ] table [ [ AS ] --- 90,104 + with_query + + + For information about with_query, see + . + + + + ONLY *** a/doc/src/sgml/ref/insert.sgml --- b/doc/src/sgml/ref/insert.sgml *** *** 21,29 PostgreSQL documentation --- 21,36 + [ WITH [ RECURSIVE ] with_query [, ...] ] INSERT INTO table [ ( column [, ...] ) ] { DEFAULT VALUES | VALUES ( { expression | DEFAULT } [, ...] ) [, ...] | query } [ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ] + + where with_query is: + + with_query_name [ ( column_name [, ...] ) ] AS ( select | insert | update | delete ) + + *** *** 85,90 INSERT INTO table [ ( + with_query + + + For information about with_query, see + . + + + + table *** a/doc/src/sgml/ref/select.sgml --- b/doc/src/sgml/ref/select.sgml *** *** 58,64 SELECT [ ALL | DISTINCT [ ON ( expressionand with_query is: ! with_query_name [ ( column_name [, ...] ) ] AS ( select ) TABLE { [ ONLY ] table_name [ * ] | with_query_name } --- 58,64 and with_query is: ! with_query_name [ ( column_name [, ...] 
) ] AS ( select | insert | update | delete ) TABLE { [ ONLY ] table_name [ * ] | with_query_name } *** *** 202,209 TABLE { [ ONLY ] table_name [ * ] | The WITH clause allows you to specify one or more ! subqueries that can be referenced by name in the primary query. ! The subqueries effectively act as temporary tables or views for the duration of the primary query. --- 2
Re: [HACKERS] [CFReview] Red-Black Tree
I would like to see point #2 of the following email addressed before commit. As things stand, it is not clear (at least to me) whether this is a win. The reimplementation of ginInsertRecordBA reduces the difference between HEAD and HEAD+rbtree in the regular case. The test suite is taken from http://www.sai.msu.su/~megera/wiki/2009-04-03: SEQ: SELECT array_to_string(ARRAY(select '' || a || '.' || b from generate_series(1,50) b), ' ')::tsvector AS i INTO foo FROM generate_series(1,10) a; RND: SELECT array_to_string(ARRAY(select '' || random() from generate_series(1,50) b), ' ')::tsvector AS i INTO foo FROM generate_series(1,10) a; Times in seconds:

        HEAD    0.9     0.11
SEQ     130     113     111
RND     11.4    12.6    11.5

The idea was to change the order of insertion - the insertion order now reduces the amount of rebalancing. Oleg's tests (http://www.sai.msu.su/~megera/wiki/rbtree_test) were made with v0.10, which differs from v0.11 only by the comments around ginInsertRecordBA() -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ rbtree-0.11.gz Description: Unix tar archive -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recent vendor SSL renegotiation patches break PostgreSQL
On Feb 3, 2010, at 10:16 AM, Stefan Kaltenbrunner wrote: > Robert Haas wrote: >> On Wed, Feb 3, 2010 at 6:24 AM, Chris Campbell >> wrote: >>> The flurry of patches that vendors have recently been making to OpenSSL to >>> address >>> the potential man-in-the-middle attack during SSL renegotiation have >>> disabled SSL >>> renegotiation altogether in the OpenSSL libraries. Applications that make >>> use of SSL >>> renegotiation, such as PostgreSQL, start failing. >> Should we think about adding a GUC to disable renegotiation until this >> blows over? > > hmm I wonder if we should not go as far as removing the whole renegotiation > code, from the field it seems that there are very very few daemons actually > doing that kind of forced renegotiation. There was a discussion about the relevance and consequences of SSL renegotiation on this list back in 2003: http://archives.postgresql.org/pgsql-interfaces/2003-04/msg00075.php Personally, my production servers have been patched to remove renegotiation completely, and I’m comfortable with the consequences of that for my usage. - Chris -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Shared catalogs vs pg_global tablespace
Tom Lane wrote: > I'm not sure whether allowing that would be good or bad. I see no > obvious killer reason why it'd be bad, but it seems like the kind of > thing we might regret someday. pg_global is in some sense an > implementation artifact, so allowing users to depend on it might be > bad in the long run. Agreed, it feels scary to allow it. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [CFReview] Red-Black Tree
I'm in the process of preparing this page: http://www.sai.msu.su/~megera/wiki/rbtree_test I hope the tests are easy to reproduce. This is a slightly improved version of the rbtree patch, which Teodor didn't commit yet. The random array test and real-life examples are ok; I'm still working on test #1, which is quite an artificial test, but I still want to understand whether the results are within the accuracy of the test. Oleg On Thu, 4 Feb 2010, Mark Cave-Ayland wrote: Robert Haas wrote: Maybe we are now getting to the heart of the confusion. Mark wrote in his email: "Unfortunately I was not really able to reproduce the RND (teodor's) dataset, nor the random array test as the SQL used to test the implementation was not present on the page above." The SQL for the fixed-length tests is posted, but the SQL for the variable length test is not - so Mark was just guessing on that one. Or am I just totally confused? ...Robert No, that's correct. In the "Repeat test with 100,000 identical records varying array length (len)" section, it's fairly easy to substitute in the varying values of len where len = 3, 30 and 50. As documented in my review email I had a guess at generating the contents of RND (teodor's) column with this query: select ARRAY(select generate_series(1, (random() * 100)::int)) as arand into arrrand from generate_series(1,10) b; However, unlike the other figures this is quite a bit different from Oleg/Teodor's results which makes me think this is the wrong query (3.5s v 9s). Obviously Robert's concern here is that it is this column that shows one of the largest performance decreases compared to head. I've also finished benchmarking the index creation scripts yesterday on Oleg's test dataset from http://www.sai.msu.su/~megera/postgres/files/links2.sql.gz. 
With maintenance_work_mem set to 256Mb, the times I got with the rbtree patch applied were: rbtest=# CREATE INDEX idin_rbtree_idx ON links2 USING gin (idin); CREATE INDEX Time: 1910741.352 ms rbtest=# CREATE INDEX idout_rbtree_idx ON links2 USING gin (idout); CREATE INDEX Time: 1647609.300 ms Without the patch applied, I ended up having to shutdown my laptop after around 90 mins before the first index had even been created. So there is a definite order of magnitude speed increase with this patch applied. ATB, Mark. Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Shared catalogs vs pg_global tablespace
So while playing with cluster/vac full on shared catalogs I ran into a small catch-22. If the transient table created by make_new_heap is marked relisshared, the code complains because that OID is not known to IsSharedRelation. If it's not marked shared, you run into various error checks and Asserts that insist that only shared tables should be in the pg_global tablespace. Obviously we must put the new table instance into pg_global, so one or the other type of sanity check is going to have to be relaxed. On reflection it seems to me that it's most correct to not mark the transient table shared: it really isn't, because only the local copy of pg_class knows anything about it. If we were to take out a shared-namespace lock on it, that lock would conflict against someone coincidentally using the same OID for similar purposes in a different database, and there's no need for that. On the other side, I think that the checking about pg_global was just introduced to try to catch coding errors that would put a table in the wrong tablespace. But that code is all long since debugged. So I'm of the opinion that we need to back off the checks that insist only shared tables can be in pg_global. We could allow that either by pushing the checks up a level (eg, from heap_create_with_catalog to DefineRelation) or by removing them altogether. The latter would mean that users could create ordinary tables in the pg_global tablespace. By default, only superusers could, for lack of public usage permissions on pg_global. I'm not sure whether allowing that would be good or bad. I see no obvious killer reason why it'd be bad, but it seems like the kind of thing we might regret someday. pg_global is in some sense an implementation artifact, so allowing users to depend on it might be bad in the long run. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CommitFest Status Summary - 2010-02-03
Robert Haas wrote: Here's an overview of where we stand with the remaining 14 patches, according to my best understanding of the situation. * rbtree - I have done a lot of work reviewing this, and Mark Cave-Ayland has done some work on it, too. But there are some unanswered performance questions that need to be addressed before commit. This is another one that could really use some more eyes on it. * knngist - The third remaining big patch. Mark Cave-Ayland volunteered to review this one, too, but so far no review has been posted on -hackers. I know that the PostGIS folks would really like to have this, but time is growing short. Yes. I'm currently working on the knngist patch now, although sadly work got side-tracked onto the rbtree patch since it is a requirement for the knngist patch. Now that the rbtree patch is in reasonably good shape, I intend to focus the rest of my time working on the knngist patch exclusively. ATB, Mark. -- Mark Cave-Ayland - Senior Technical Architect PostgreSQL - PostGIS Sirius Corporation plc - control through freedom http://www.siriusit.co.uk t: +44 870 608 0063 Sirius Labs: http://www.siriusit.co.uk/labs -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [CFReview] Red-Black Tree
Robert Haas wrote: Maybe we are now getting to the heart of the confusion. Mark wrote in his email: "Unfortunately I was not really able to reproduce the RND (teodor's) dataset, nor the random array test as the SQL used to test the implementation was not present on the page above." The SQL for the fixed-length tests is posted, but the SQL for the variable length test is not - so Mark was just guessing on that one. Or am I just totally confused? ...Robert No, that's correct. In the "Repeat test with 100,000 identical records varying array length (len)" section, it's fairly easy to substitute in the varying values of len where len = 3, 30 and 50. As documented in my review email I had a guess at generating the contents of RND (teodor's) column with this query: select ARRAY(select generate_series(1, (random() * 100)::int)) as arand into arrrand from generate_series(1,10) b; However, unlike the other figures this is quite a bit different from Oleg/Teodor's results which makes me think this is the wrong query (3.5s v 9s). Obviously Robert's concern here is that it is this column that shows one of the largest performance decreases compared to head. I've also finished benchmarking the index creation scripts yesterday on Oleg's test dataset from http://www.sai.msu.su/~megera/postgres/files/links2.sql.gz. With maintenance_work_mem set to 256Mb, the times I got with the rbtree patch applied were: rbtest=# CREATE INDEX idin_rbtree_idx ON links2 USING gin (idin); CREATE INDEX Time: 1910741.352 ms rbtest=# CREATE INDEX idout_rbtree_idx ON links2 USING gin (idout); CREATE INDEX Time: 1647609.300 ms Without the patch applied, I ended up having to shutdown my laptop after around 90 mins before the first index had even been created. So there is a definite order of magnitude speed increase with this patch applied. ATB, Mark. 
-- Mark Cave-Ayland - Senior Technical Architect PostgreSQL - PostGIS Sirius Corporation plc - control through freedom http://www.siriusit.co.uk t: +44 870 608 0063 Sirius Labs: http://www.siriusit.co.uk/labs -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Largeobject Access Controls (r2460)
(2010/02/04 17:30), KaiGai Kohei wrote: > (2010/02/04 0:20), Robert Haas wrote: >> 2010/2/1 KaiGai Kohei: >>> I again wonder whether we are on the right direction. >> >> I believe the proposed approach is to dump blob metadata if and only >> if you are also dumping blob contents, and to do all of this for data >> dumps but not schema dumps. That seems about right to me. > > In other words: > > default -> blob contents and metadata (owner, acl, comments) shall >be dumped > --data-only -> only blob contents shall be dumped > --schema-only -> neither blob contents nor metadata are dumped. > > Can I understand correctly? The attached patch makes pg_dump not dump the "BLOB ITEM" sections and corresponding metadata when --data-only is specified. In addition, it outputs neither "BLOB DATA" nor "BLOB ITEM" sections when --schema-only is specified. When --data-only is given to pg_dump, it does not construct any DO_BLOB_ITEM entries in getBlobs(), so none of the metadata (owner, acls, comment) is dumped. It also writes the legacy "BLOBS" section instead of the new "BLOB DATA" section, to inform pg_restore that this archive does not create large objects in a "BLOB ITEM" section. If --schema-only is given, getBlobs() is simply skipped. When --data-only is given to pg_restore, it skips all the "BLOB ITEM" sections. Large objects are created in _LoadBlobs() instead of in that section, as we have done until now. _LoadBlobs() takes a third argument which specifies whether we should create the large object here or not. Its condition is a bit modified from the previous patch:

    if (strcmp(te->desc, "BLOBS") == 0 || ropt->dataOnly)
        _LoadBlobs(AH, ropt, true);
    else if (strcmp(te->desc, "BLOB DATA") == 0)
        _LoadBlobs(AH, ropt, false);

When --data-only is given to pg_restore, the "BLOB ITEM" sections are skipped, so we need to create large objects at the _LoadBlobs() stage, even if the archive has a "BLOB DATA" section. 
In addition, --schema-only suppresses all the "BLOB ITEM" sections using a special condition that was added to _tocEntryRequired(). It might be a bit different from what Itagaki-san suggested, because the "BLOB ITEM" entries are still in SECTION_PRE_DATA. However, it minimizes special treatments in the code, and there is no difference from the end-user's point of view. Or is it necessary to pack them into SECTION_DATA anyway? Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei pgsql-fix-pg_dump-blob-privs.4.patch Description: application/octect-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Largeobject Access Controls (r2460)
(2010/02/04 0:20), Robert Haas wrote: > 2010/2/1 KaiGai Kohei: >> I again wonder whether we are on the right direction. > > I believe the proposed approach is to dump blob metadata if and only > if you are also dumping blob contents, and to do all of this for data > dumps but not schema dumps. That seems about right to me. In other words:

  default       -> blob contents and metadata (owner, acl, comments) shall be dumped
  --data-only   -> only blob contents shall be dumped
  --schema-only -> neither blob contents nor metadata are dumped

Can I understand correctly? >> Originally, the reason why we decided to use a per-blob toc entry was >> that the "BLOB ACLS" entry needed a few exceptional treatments in the code. >> But, if we deal with the "BLOB ITEM" entry as data contents, it will also >> need additional exceptional treatments. > But the new ones are less objectionable, maybe. > ...Robert > -- OSS Platform Development Division, NEC KaiGai Kohei -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers