date:20100607

On Mon, Jun 7, 2010 at 5:42 AM, Andrew Dunstan and...@dunslane.net wrote:

 The docs state, w.r.t. $subject:

   The password can be provided either in the primary_conninfo string
   or in a separate ~/.pgpass file on the standby server.

 I tried this with a database name of replication in the .pgpass file,
 which matches what we need to use in pg_hba.conf, but it failed miserably,
 and only worked when I used a wildcard for the database name in the .pgpass
 file.

 If this is expected it needs to be documented more clearly; if not, it's a
 bug.

Yep, this is expected, so we need to improve the doc. What about:


diff --git a/doc/src/sgml/high-availability.sgml
b/doc/src/sgml/high-availability.sgml
index 5c0d9ab..458a4e2 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -819,7 +819,9 @@ hostreplication foo
192.168.1.100/32md5
 para
  The host name and port number of the primary, connection user name,
  and password are specified in the filenamerecovery.conf/ file or
- the corresponding environment variable on the standby.
+ in a separate filename~/.pgpass/ on the standby (In the latter case,
+ literaldatabase/ field in a filename~/.pgpass/ file must be
+ literal*/).
  For example, if the primary is running on host IP
literal192.168.1.50/,
  port literal5432/literal, the superuser's name for replication is
  literalfoo/, and the password is literalfoopass/, the
administrator
diff --git a/doc/src/sgml/recovery-config.sgml
b/doc/src/sgml/recovery-config.sgml
index 439db3f..cc351f8 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -268,9 +268,11 @@ restore_command = 'copy
C:\\server\\archivedir\\%f %p'  # Windows
   primary (see
   xref linkend=streaming-replication-authentication).
   A password needs to be provided too, if the primary demands password
-  authentication.  (The password can be provided either in
+  authentication.  The password can be provided either in
   the varnameprimary_conninfo/varname string or in a separate
-  filename~/.pgpass/ file on the standby server.)
+  filename~/.pgpass/ file on the standby server (in the
latter case,
+  literaldatabase/ field in a filename~/.pgpass/ file must be
+  literal*/).
   Do not specify a database name in the
   varnameprimary_conninfo/varname string.
  /para

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Heikki Linnakangas


On 07/06/10 06:06, Stephen Frost wrote:

Also, perhaps I'm not being paranoid enough, but all this concern over
error cases really doesn't really worry me that much.  The amount of
data one could acquire that way is pretty limited.


It's not limited. It allows you to read all contents of the underlying 
table or tables. I don't see much point doing anything at all if we 
don't plug that.


There's many side channels like exposing row counts in EXPLAIN and 
statistics and timing attacks, that are not as critical, because they 
don't let expose all data, and the attacker can't accurately choose what 
data is exposed. Those are not as important.



--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] exporting raw parser

2010-06-07 Thread Daniel Farina

On Wed, May 26, 2010 at 6:02 PM, Tatsuo Ishii is...@postgresql.org wrote:
 I'm thinking about exporting the raw parser and related modules as a C
 library. Though this will not be an immediate benefit of PostgreSQL
 itself, it will be a huge benefit for any PostgreSQL
 applications/middle ware those need to parse SQL statements.

In the past I and people I have known/worked with have made strategic
use of UDFs running on a live server that return the parse tree,
semantically analyzed tree, and planned tree (I think) outNode textual
representation for various projects, and found them highly useful.
Syntactic, semantic, and operational meaning of a query was useful for
our projects.

Some of this code was linked with the server, and so reading the node
using Postgres' parser was easy. Otherwise, a small parser needed be
written for external projects. Perhaps a slightly more ideal state of
affairs would be:

* These hooks to acquire the syntactic/semantic/planned trees would be
bundled for free
* When writing code not linked against the server, a more common
serialization format, ala JSON or whatnot

A more ambitious project that I don't think is in the scope of any
initial implementation would be to allow for cross referencing of
these compilation passes, similar to how GNU Bison allows you to
interrogate for the position of a lexeme when reporting errors. In my
experience, code written that mangles one layer (say, semantic, or
harder yet, plan) has a hard time doing the best error because getting
from a node at the bottom to the right lexeme(s) at the top is
very cumbersome. One could imagine this being useful for other
purposes too, but that is how I felt it firsthand. Feels a lot harder,
though.

fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] nvarchar notation accepted?

2010-06-07 Thread Peter Eisentraut

On sön, 2010-06-06 at 21:13 -0500, Jaime Casanova wrote:
 On Thu, May 13, 2010 at 11:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Jaime Casanova ja...@2ndquadrant.com writes:
  On Thu, May 13, 2010 at 10:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Actually, the lexer translates N'foo' to NCHAR 'foo' and then the
  grammar treats that just like CHAR 'foo'.  In short, the N doesn't do
  anything very useful, and it certainly doesn't have any effect on
  encoding behavior.  I think this is something Tom Lockhart put in ten or
  so years back, and never got as far as making it actually do anything
  helpful.
 
  so, the N'' syntax is fine and i don't need to hunt them as a migration 
  step?
 
  As long as the implied cast to char(n) doesn't cause you problems, it's
  fine.
 
 
 Is this something we want to document? Maybe something like:
 
 For historical reasons N'' syntax is also accepted as a string literal.
 
 
 or we can even mention the fact that that is useful for sql server migrations?

I don't think it's a historical reason, at least not unless all reasons
are to some degree historical.  The N'' syntax is in the SQL standard,
and so if our implementation matches that, it should be documented as a
supported feature, and if it doesn't match it, we should fix it, and
perhaps leave it undocumented until we have figured out what we want it
to do.  (I have not done that analysis.)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

(2010/06/07 15:48), Heikki Linnakangas wrote:
 On 07/06/10 06:06, Stephen Frost wrote:
 Also, perhaps I'm not being paranoid enough, but all this concern over
 error cases really doesn't really worry me that much. The amount of
 data one could acquire that way is pretty limited.
 
 It's not limited. It allows you to read all contents of the underlying 
 table or tables. I don't see much point doing anything at all if we 
 don't plug that.
 
IIRC, Oracle pays attention not to expose function's arguments via
error messages due to the security reasons, although it focuses on
that does not provide a hint to attacker who tries SQL injection.

It seems to me it is a matter of degree.

If we allows to execute lowrite() inside of the join loop, it can
be abused to move massive invisible information into visible large
objects within a little time. So, its priority is relatively higher
than error messages.

However, we cannot move massive information via error messages in
a little time, although it may expose whole of the arguments.
So, its threat is relatively smaller.

 There's many side channels like exposing row counts in EXPLAIN and 
 statistics and timing attacks, that are not as critical, because they 
 don't let expose all data, and the attacker can't accurately choose what 
 data is exposed. Those are not as important.
 
It also means; because they can provide much smaller bandwidth to leak
invisible information than error messages, these are not as important.
Is it right?

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] exporting raw parser

2010-06-07 Thread Dimitri Fontaine

Daniel Farina drfar...@acm.org writes:
 Some of this code was linked with the server, and so reading the node
 using Postgres' parser was easy. Otherwise, a small parser needed be
 written for external projects. Perhaps a slightly more ideal state of
 affairs would be:

 * These hooks to acquire the syntactic/semantic/planned trees would be
 bundled for free
 * When writing code not linked against the server, a more common
 serialization format, ala JSON or whatnot

Accessing to those data have been talked about with respect to DDL
triggers too. You want to be able to know what exactly is being
executed, and against what objects.

And you want to be able to abuse this information from either a C-coded
server function or a PLpgSQL trigger. I guess the WIP JSON datatype
would help a lot even when working from within the server, as that does
not mean working in C.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Heikki Linnakangas

On 07/06/10 10:30, KaiGai Kohei wrote:
 (2010/06/07 15:48), Heikki Linnakangas wrote:
 There's many side channels like exposing row counts in EXPLAIN and
 statistics and timing attacks, that are not as critical, because they
 don't let expose all data, and the attacker can't accurately choose what
 data is exposed. Those are not as important.

 It also means; because they can provide much smaller bandwidth to leak
 invisible information than error messages, these are not as important.
 Is it right?

The big difference is what information can be obtained, not how fast it
can be obtained.

Imagine a table that holds username/passwords for users. Each user is
allowed to see his own row, including password, but not anyone else's.
EXPLAIN side-channel might give pretty accurate information of how many
rows there is in the table, and via clever EXPLAIN+statistics probing
you might be able to find out what the top-10 passwords are, for
example. But if you wanted to know what your neighbor's password is, the
side-channels would not help you much, but an error message would reveal
it easily.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-06-07 Thread Pierre C


On Sunday 30 May 2010 18:29:31 Greg Stark wrote:

On Sun, May 30, 2010 at 4:54 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I read through that thread and couldn't find much discussion of
 alternative CRC implementations --- we spent all our time on arguing
 about whether we needed 64-bit CRC or not.


SSE4.2 has a hardware CRC32 instruction, this might be interesting to  
use...


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-06-07 Thread Andres Freund

On Monday 07 June 2010 12:37:13 Pierre C wrote:
  On Sunday 30 May 2010 18:29:31 Greg Stark wrote:
  On Sun, May 30, 2010 at 4:54 AM, Tom Lane t...@sss.pgh.pa.us wrote:
   I read through that thread and couldn't find much discussion of
   alternative CRC implementations --- we spent all our time on arguing
   about whether we needed 64-bit CRC or not.
 
 SSE4.2 has a hardware CRC32 instruction, this might be interesting to
 use...
Different polynom unfortunately...

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

Heikki,

* Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
 The big difference is what information can be obtained, not how fast it
 can be obtained.

Actually, I disagree.  Time required to acquire the data does matter.

 Imagine a table that holds username/passwords for users. Each user is
 allowed to see his own row, including password, but not anyone else's.
 EXPLAIN side-channel might give pretty accurate information of how many
 rows there is in the table, and via clever EXPLAIN+statistics probing
 you might be able to find out what the top-10 passwords are, for
 example. But if you wanted to know what your neighbor's password is, the
 side-channels would not help you much, but an error message would reveal
 it easily.

Using only built-ins, could you elaborate on how one could pick exactly
what row was revealed using an error case?  That strikes me as
difficult, but perhaps I'm not thinking creatively enough.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Heikki Linnakangas


On 07/06/10 14:06, Stephen Frost wrote:

* Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:

The big difference is what information can be obtained, not how fast it
can be obtained.


Actually, I disagree.  Time required to acquire the data does matter.


Depends on the magnitude, of course. If it takes 1 year per row, that's 
probably acceptable. If it takes 1 second, that's extremely slow 
compared to normal queries, but most likely still disastreous from a 
security point of view.



Imagine a table that holds username/passwords for users. Each user is
allowed to see his own row, including password, but not anyone else's.
EXPLAIN side-channel might give pretty accurate information of how many
rows there is in the table, and via clever EXPLAIN+statistics probing
you might be able to find out what the top-10 passwords are, for
example. But if you wanted to know what your neighbor's password is, the
side-channels would not help you much, but an error message would reveal
it easily.


Using only built-ins, could you elaborate on how one could pick exactly
what row was revealed using an error case?  That strikes me as
difficult, but perhaps I'm not thinking creatively enough.


WHERE should do it:

SELECT * FROM secrets_view WHERE username = 'neighbor' AND 
password::integer = 1234;

ERROR:  invalid input syntax for integer: neighborssecretpassword

Assuming that username = 'neighbor' is evaluated before the cast.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-06-07 Thread Florian Pflug

On Jun 7, 2010, at 12:45 , Andres Freund wrote:
 On Monday 07 June 2010 12:37:13 Pierre C wrote:
 On Sunday 30 May 2010 18:29:31 Greg Stark wrote:
 On Sun, May 30, 2010 at 4:54 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I read through that thread and couldn't find much discussion of
 alternative CRC implementations --- we spent all our time on arguing
 about whether we needed 64-bit CRC or not.
 
 SSE4.2 has a hardware CRC32 instruction, this might be interesting to
 use...
 Different polynom unfortunately...

Since only the WAL uses CRC, I guess the polynomial could be changed though. 
pg_upgrade for example shouldn't care.

RFC3385 compares different checksumming methods for use in iSCSI, and CRC32c 
(which uses the same polynomial as the SSE4.2 instruction) wins. Here's 
a link: http://www.faqs.org/rfcs/rfc3385.html

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

* Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
 WHERE should do it:

 SELECT * FROM secrets_view WHERE username = 'neighbor' AND  
 password::integer = 1234;
 ERROR:  invalid input syntax for integer: neighborssecretpassword

 Assuming that username = 'neighbor' is evaluated before the cast.

Fair enough, so we can't allow built-ins either, except perhaps in very
specific/limited situations.  Still, if we track that the above WHERE
and password::integer calls *should* be run as role X, while the view
should run as role Y, maybe we can at least identify the case where
we've ended up in a situation where we are going to expose unintended
data.  I don't know enough about the optimizer or the planner to have
any clue how we might teach them to actually avoid doing such, though I
certainly believe it could end up being a disaster on performance based
on comments from others who know better. :)

Thanks,

Stephen


signature.asc
Description: Digital signature

[HACKERS] warning message in standby

Hi,

When an error is found in the WAL streamed from the master, a warning
message is repeated without interval forever in the standby. This
consumes CPU load very much, and would interfere with read-only queries.
To fix this problem, we should add a sleep into emode_for_corrupt_record()
or somewhere? Or we should stop walreceiver and retry to read WAL from
pg_xlog or the archive?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [COMMITTERS] pgsql: Add current WAL end (as seen by walsender, ie, GetWriteRecPtr()

On Fri, Jun 4, 2010 at 7:17 AM, Tom Lane t...@postgresql.org wrote:
 Log Message:
 ---
 Add current WAL end (as seen by walsender, ie, GetWriteRecPtr() result)
 and current server clock time to SR data messages.  These are not currently
 used on the slave side but seem likely to be useful in future, and it'd be
 better not to change the SR protocol after release.  Per discussion.
 Also do some minor code review and cleanup on walsender.c, and improve the
 protocol documentation.

This commit changed walsender so that it doesn't call set_ps_display() if
update_process_title = off. On the other hand, walreceiver doesn't check
update_process_title. Though this check might not be required since it's
within set_ps_display(), we should do that for the sake of consistency?
I attached the patch.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


walrcv_update_process_title_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] Invalid YAML output from EXPLAIN

2010-06-07 Thread Greg Sabino Mullane


-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160

Dean Rasheed wrote:
...
 So the current code in escape_yaml() is inadequate for producing valid
 YAML. I think it would have to also consider at least the following
 characters as special -  :  [  ]  {  }  ,  \  '
 |  *  . Technically, it would also need to trap empty strings,
 and strings with leading or trailing whitespace.

 Making escape_yaml() completely bulletproof with this approach would
 be quite difficult, and (IMO) not worth the effort
...

Doesn't seem like a lot of effort to me. You've already laid out most of 
the exceptions above, although they require a few tweaks.
The rules should be:

Requires quoting only if the first character:
   *  !  |'%  @  ` #

Same as above, but no quoting if the second character is safe:
 -  ?  :

Always requires quoting:
 :space  space#  aka  ': '  ' #'

Always requires quoting:
 ,  [  ]  {  }

Always require quoting:
 (leading space) (trailing space) (empty string)

See:
http://yaml.org/spec/1.2/spec.html section 5.3 and 7.3.3


- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201006070943
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAkwM+wAACgkQvJuQZxSWSsgWZACcCgb0rDvA6ZVhHId/q568gBGo
sjgAoLY7HbkI7sRpO45vi0jSRJ2Fiytk
=v7T/
-END PGP SIGNATURE-



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] Invalid YAML output from EXPLAIN

Greg Sabino Mullane g...@turnstep.com writes:
 The rules should be:

 Requires quoting only if the first character:
*  !  |'%  @  ` #

 Same as above, but no quoting if the second character is safe:
  -  ?  :

 Always requires quoting:
  :space  space#  aka  ': '  ' #'

 Always requires quoting:
  ,  [  ]  {  }

 Always require quoting:
  (leading space) (trailing space) (empty string)

Egad ... this is supposed to be an easily machine-generatable format?

If it's really as broken as the above suggests, I think we should
rip it out while we still can.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] Invalid YAML output from EXPLAIN

2010-06-07 Thread Greg Sabino Mullane


-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


Tom Lane wrote:
...
 Egad ... this is supposed to be an easily machine-generatable format?

 If it's really as broken as the above suggests, I think we should
 rip it out while we still can.

Heh ... not like you to shrink from a challenge. ;)

I don't think the above would be particularly hard to implement myself, 
but if it becomes a really big deal, we can certainly punt by simply 
quoting anything containing an indicator (the special characters above).
It will still be 100% valid YAML, just with some excess quoting for the 
very rare case when a value contains one of the special characters.

- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201006071035
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAkwNA+4ACgkQvJuQZxSWSshSswCg81kd3FdYnQup1eLWGesm+vm+
VO8AoL1Fwil/vXfRdRHx4A4zZUTDbZuT
=oPDv
-END PGP SIGNATURE-



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] Invalid YAML output from EXPLAIN

On Mon, Jun 7, 2010 at 10:37 AM, Greg Sabino Mullane g...@turnstep.com wrote:
 Tom Lane wrote:
 I don't think the above would be particularly hard to implement myself,
 but if it becomes a really big deal, we can certainly punt by simply
 quoting anything containing an indicator (the special characters above).
 It will still be 100% valid YAML, just with some excess quoting for the
 very rare case when a value contains one of the special characters.

Since you're the main advocate of this feature, I think you should
implement it rather than leaving it to Tom or I.

The reason why I was initially skeptical of adding a YAML output
format is that JSON is a subset of YAML.  Therefore, the JSON output
format ought to be perfectly sufficient for anyone using a YAML
parser.  If it's not, that's because their YAML processor is broken,
and they should get a new one, or because the YAML spec is defective.
The YAML format got voted in by consensus because people thought that
it would also make a nice alternative to the text format for human
readable output.  I don't believe that (it uses way too much vertical
space) but even if you accept the argument, the more we make the YAML
format look like the JSON format, the less water that argument holds.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] Invalid YAML output from EXPLAIN

2010-06-07 Thread Andrew Dunstan




Robert Haas wrote:

On Mon, Jun 7, 2010 at 10:37 AM, Greg Sabino Mullane g...@turnstep.com wrote:
  

Tom Lane wrote:
I don't think the above would be particularly hard to implement myself,
but if it becomes a really big deal, we can certainly punt by simply
quoting anything containing an indicator (the special characters above).
It will still be 100% valid YAML, just with some excess quoting for the
very rare case when a value contains one of the special characters.



Since you're the main advocate of this feature, I think you should
implement it rather than leaving it to Tom or I.
  


Or anyone else :-)


The reason why I was initially skeptical of adding a YAML output
format is that JSON is a subset of YAML.  Therefore, the JSON output
format ought to be perfectly sufficient for anyone using a YAML
parser.  
  


There is some debate on this point, IIRC.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] Invalid YAML output from EXPLAIN

Greg Sabino Mullane g...@turnstep.com writes:
 I don't think the above would be particularly hard to implement myself, 
 but if it becomes a really big deal, we can certainly punt by simply 
 quoting anything containing an indicator (the special characters above).

I would go with that.  The quoting rules you proposed previously seem
way too complicated --- meaning potentially buggy, and even if they're
not buggy, the behavior would seem unpredictable to most users.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] Invalid YAML output from EXPLAIN

2010-06-07 Thread Dean Rasheed

On 7 June 2010 15:56, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Sabino Mullane g...@turnstep.com writes:
 I don't think the above would be particularly hard to implement myself,
 but if it becomes a really big deal, we can certainly punt by simply
 quoting anything containing an indicator (the special characters above).

 I would go with that.  The quoting rules you proposed previously seem
 way too complicated --- meaning potentially buggy, and even if they're
 not buggy, the behavior would seem unpredictable to most users.


Well actually it's not just everything containing a special character,
it's also anything with leading or trailing whitespace, and empty
strings (not sure that can ever happen in practice).

It's because of the potential for bugs in this area, that I'd propose
just quoting everything (except numeric values) as in my original
patch.

Regards,
Dean

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SR slaves and .pgpass

Fujii Masao masao.fu...@gmail.com writes:
 On Mon, Jun 7, 2010 at 5:42 AM, Andrew Dunstan and...@dunslane.net wrote:
 I tried this with a database name of replication in the .pgpass file,
 which matches what we need to use in pg_hba.conf, but it failed miserably,
 and only worked when I used a wildcard for the database name in the .pgpass
 file.
 
 If this is expected it needs to be documented more clearly; if not, it's a
 bug.

 Yep, this is expected, so we need to improve the doc.

Why don't we improve the code, instead?  In particular make
libpqrcv_connect() do

-   snprintf(conninfo_repl, sizeof(conninfo_repl), %s replication=true, 
conninfo);
+   snprintf(conninfo_repl, sizeof(conninfo_repl), %s database=replication 
replication=true, conninfo);

I don't think it's unlikely that someone would try to enter a
replication-specific password into ~/.pgpass.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] pgstatindex still throws ERROR: value 3220078592 is out of range for type integer

2010-06-07 Thread Dave Cramer

I noted on line 169 that max_avail is still an int ? Where else would
it be having problems ?

Dave

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SR slaves and .pgpass

2010-06-07 Thread Andrew Dunstan




Tom Lane wrote:

Why don't we improve the code, instead?  In particular make
libpqrcv_connect() do

-   snprintf(conninfo_repl, sizeof(conninfo_repl), %s replication=true, 
conninfo);
+   snprintf(conninfo_repl, sizeof(conninfo_repl), %s database=replication 
replication=true, conninfo);

I don't think it's unlikely that someone would try to enter a
replication-specific password into ~/.pgpass.


  


+1. It's highly likely - that's how we got here in the first place. It 
seems to me like a perfectly reasonable thing to do.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] psql: ON_ERROR_STOP command-line flag for scripts

2010-06-07 Thread Mike Toews

Hey hackers,

I routinely run SQL scripts using pgsql -f myscript.sql. If there is
an error mid-way in a large script file, due to the default
ON_ERROR_STOP=0, I see repeated ERROR:  current transaction is
aborted, commands ignored until end of transaction block until the
end of the script. Often the error is buried at the top of the output
and the following commands are meaningless or misleading messages with
a last ROLLBACK at the bottom. This is particular relevant to new
PostGIS users running postgis.sql that missed a step somewhere.

I know I can either embed a non-SQL command in the SQL file or add
--set ON_ERROR_STOP =1 to the command arguments, however I'd like a
simpler command interface (just like there is -q / --quiet for QUIET).
Could I suggest a new command-line option -r / --on-error-stop for
psql? I have limited programming with C, should I bother submitting a
patch, or is someone else eager?

Thanks for your consideration.

-Mike

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] Invalid YAML output from EXPLAIN

2010-06-07 Thread Florian Weimer

* Tom Lane:

 Egad ... this is supposed to be an easily machine-generatable format?

Perhaps you could surround all strings with  in the generator, and
escape all potentially special characters (which seems to include some
whitespace even in quoted strings, unfortunately)?

It has been claimed before that YAML is a superset of JSON, so why
can't the YAML folks use the existing JSON output instead?

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] psql: ON_ERROR_STOP command-line flag for scripts

* Mike Toews (mwto...@gmail.com) wrote:
 I know I can either embed a non-SQL command in the SQL file or add
 --set ON_ERROR_STOP =1 to the command arguments, however I'd like a
 simpler command interface (just like there is -q / --quiet for QUIET).
 Could I suggest a new command-line option -r / --on-error-stop for
 psql? I have limited programming with C, should I bother submitting a
 patch, or is someone else eager?

Honestly, I'd love to see something like --script that does a number
of things:

Forcibly set (regardless of .psqlrc, in particular):
--no-psqlrc
--echo-all
--single-transaction

ON_ERROR_ROLLBACK=off
ON_ERROR_STOP=on
AUTOCOMMIT=off

Assume -f - if no -f passed in (provided this actually works as
expected..  I'm not sure it does, I remember there being some discussion
about this recently).

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] [BUGS] Invalid YAML output from EXPLAIN

2010-06-07 Thread Greg Smith


Florian Weimer wrote:

It has been claimed before that YAML is a superset of JSON, so why
can't the YAML folks use the existing JSON output instead?
  


Because JSON just crosses the line where it feels like there's so much 
markup that people expect a tool is necessary to read it, which has 
always been the issue with XML too--bad human readability.  I was on the 
fence about YAML until I used it for a client issue over the weekend.  I 
was able to hack together a quick tool to work on the issue that parsed 
enough YAML *without using an external library* well enough for my 
purposes in an hour, one that was still far more robust than a similar 
hack trying to read plain old text format for EXPLAIN.  And the client 
was able to follow what was going on as I passed YAML output back and 
forth with them.  Just having every field labeled clearly cut off all 
the usual which of these is the startup cost again? questions I'm used 
to getting.


The complaints about YAML taking up too much vertical space are 
understandable, but completely opposite of what I care about.  I can 
e-mail a customer a YAML plan and it will survive to the other side and 
even in a reply back to me.  Whereas any non-trivial text format one is 
guaranteed to utterly destroyed by line wrapping along the way.


I think this thread could use a fresh example to remind anyone who 
hasn't played with the curent YAML format what it looks like.  Here's 
one from a query against the Dell Store 2 database:


EXPLAIN SELECT * FROM customers WHERE customerid1000 ORDER BY zip;
QUERY PLAN
--
Sort  (cost=4449.30..4496.80 rows=19000 width=268)
  Sort Key: zip
  -  Seq Scan on customers  (cost=0.00..726.00 rows=19000 width=268)
Filter: (customerid  1000)

EXPLAIN (FORMAT YAML) SELECT * FROM customers WHERE customerid1000 
ORDER BY zip;
QUERY PLAN 
-

- Plan:+
Node Type: Sort+
Startup Cost: 4449.30  +
Total Cost: 4496.80+
Plan Rows: 19000   +
Plan Width: 268+
Sort Key:  +
  - zip+
Plans: +
  - Node Type: Seq Scan+
Parent Relationship: Outer +
Relation Name: customers   +
Alias: customers   +
Startup Cost: 0.00 +
Total Cost: 726.00 +
Plan Rows: 19000   +
Plan Width: 268+
Filter: (customerid  1000)

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] Invalid YAML output from EXPLAIN

Greg Smith g...@2ndquadrant.com writes:
 The complaints about YAML taking up too much vertical space are 
 understandable, but completely opposite of what I care about.  I can 
 e-mail a customer a YAML plan and it will survive to the other side and 
 even in a reply back to me.  Whereas any non-trivial text format one is 
 guaranteed to utterly destroyed by line wrapping along the way.

 I think this thread could use a fresh example to remind anyone who 
 hasn't played with the curent YAML format what it looks like.

So?  This doesn't look amazingly unlike the current JSON output,
and to the extent that we have to add more quoting to it, it's
going to look even more like the JSON output.

Given the lack of any field separators other than newlines, I'm also
finding myself extremely doubtful about the claim that it survives
line-wrapping mutilations well.  For instance this bit:

   - Node Type: Seq Scan
 Parent Relationship: Outer

doesn't appear to have anything but whitespace to distinguish it from

   - Node Type: Seq Scan Parent
 Relationship: Outer

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] nvarchar notation accepted?

2010-06-07 Thread Jaime Casanova

On Mon, Jun 7, 2010 at 2:23 AM, Peter Eisentraut pete...@gmx.net wrote:

 The N'' syntax is in the SQL standard,


I didn't know that, do you know what paragraph is it? i can't find it

-- 
Jaime Casanova www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] nvarchar notation accepted?

2010-06-07 Thread Peter Eisentraut

On mån, 2010-06-07 at 12:56 -0500, Jaime Casanova wrote:
 On Mon, Jun 7, 2010 at 2:23 AM, Peter Eisentraut pete...@gmx.net wrote:
 
  The N'' syntax is in the SQL standard,
 
 
 I didn't know that, do you know what paragraph is it? i can't find it

Look for national character string literal.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] Invalid YAML output from EXPLAIN

2010-06-07 Thread Josh Berkus


 It's because of the potential for bugs in this area, that I'd propose
 just quoting everything (except numeric values) as in my original
 patch.

I don't see a problem with this.

I supported YAML output because I find it easier to read and copypaste
than the other outputs.  This is still the case even with quoting.  And
it's not exactly a hugely intrusive patch.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Functional dependencies and GROUP BY

2010-06-07 Thread Peter Eisentraut

I have developed a patch that partially implements the functional
dependency feature that allows some columns to be omitted from the
GROUP BY clause if it can be shown that the columns are functionally
dependent on the columns in the group by clause and therefore guaranteed
to be unique per group.  The full functional dependency deduction rules
are pretty big and arcane, so I concentrated on getting a useful subset
working.  In particular:

When grouping by primary key, the other columns can be omitted, e.g.,

CREATE TABLE tab1 (a int PRIMARY KEY, b int);

SELECT a, b FROM tab1 GROUP BY a;

This is frequently requested by MySQL converts (and possibly others).

Also, when a column is compared with a constant, it can appear
ungrouped:

SELECT x, y FROM tab2 WHERE y = 5 GROUP BY x;

For lack of a better idea, I have made it so that merge-joinable
operators qualify as equality operators.  Better ideas welcome.

Other rules could be added over time (but I'm current not planning to
work on that myself).

At this point, this patch could use some review and testing with unusual
queries that break my implementation. ;-)

diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index d0c41ce..e40cc4c 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -886,10 +886,7 @@ SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
 In this example, the columns literalproduct_id/literal,
 literalp.name/literal, and literalp.price/literal must be
 in the literalGROUP BY/ clause since they are referenced in
-the query select list.  (Depending on how the products
-table is set up, name and price might be fully dependent on the
-product ID, so the additional groupings could theoretically be
-unnecessary, though this is not implemented.)  The column
+the query select list (but see below).  The column
 literals.units/ does not have to be in the literalGROUP
 BY/ list since it is only used in an aggregate expression
 (literalsum(...)/literal), which represents the sales
@@ -898,6 +895,18 @@ SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
/para
 
para
+If the products table is set up so that,
+say, literalproduct_id/literal is the primary key or a
+not-null unique constraint, then it would be enough to group
+by literalproduct_id/literal in the above example, since name
+and price would be firsttermfunctionally
+dependent/firsttermindextermprimaryfunctional
+dependency/primary/indexterm on the product ID, and so there
+would be no ambiguity about which name and price value to return
+for each product ID group.
+   /para
+
+   para
 In strict SQL, literalGROUP BY/ can only group by columns of
 the source table but productnamePostgreSQL/productname extends
 this to also allow literalGROUP BY/ to group by columns in the
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index a4d017f..d901390 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -520,9 +520,17 @@ GROUP BY replaceable class=parameterexpression/replaceable [, ...]
 produces a single value computed across all the selected rows).
 When literalGROUP BY/literal is present, it is not valid for
 the commandSELECT/command list expressions to refer to
-ungrouped columns except within aggregate functions, since there
-would be more than one possible value to return for an ungrouped
-column.
+ungrouped columns except within aggregate functions or if the
+ungrouped column is functionally dependent on the grouped columns,
+since there would otherwise be more than one possible value to
+return for an ungrouped column.  A functional dependency exists if
+the grouped columns (or a subset thereof) are the primary key or a
+not-null unique constraint of the table containing the ungrouped
+column.  A functional dependency also exists if the ungrouped
+column is constrained by the literalWHERE/literal clause to a
+constant value (for example, by equality comparison with a
+constant).  Further rules for determining functional dependencies
+might be added in the future.
/para
   /refsect2
 
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 0a69bde..bfdd7ef 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -14,6 +14,8 @@
  */
 #include postgres.h
 
+#include access/heapam.h
+#include catalog/pg_index.h
 #include nodes/makefuncs.h
 #include nodes/nodeFuncs.h
 #include optimizer/tlist.h
@@ -24,20 +26,23 @@
 #include rewrite/rewriteManip.h
 #include utils/builtins.h
 #include utils/lsyscache.h
+#include utils/syscache.h
 
 
 typedef struct
 {
 	ParseState *pstate;
+	Query	   *qry;
 	List	   *groupClauses;
 	bool		have_non_var_grouping;
 	int			sublevels_up;
 } check_ungrouped_columns_context;
 
-static void check_ungrouped_columns(Node *node, ParseState *pstate,

Re: [HACKERS] [BUGS] Invalid YAML output from EXPLAIN

2010-06-07 Thread Greg Smith


Tom Lane wrote:

This doesn't look amazingly unlike the current JSON output,
and to the extent that we have to add more quoting to it, it's
going to look even more like the JSON output.
  


I don't know about that; here's the JSON one:

EXPLAIN (FORMAT JSON) SELECT * FROM customers WHERE customerid1000 
ORDER BY zip;
   QUERY PLAN
---

[+
  {  +
Plan: {+
  Node Type: Sort,   +
  Startup Cost: 4449.30,   +
  Total Cost: 4496.80, +
  Plan Rows: 19000,+
  Plan Width: 268, +
  Sort Key: [zip],   +
  Plans: [ +
{+
  Node Type: Seq Scan,   +
  Parent Relationship: Outer,+
  Relation Name: customers,  +
  Alias: customers,  +
  Startup Cost: 0.00,  +
  Total Cost: 726.00,  +
  Plan Rows: 19000,+
  Plan Width: 268, +
  Filter: (customerid  1000)+
}+
  ]  +
}+
  }  +
]

From the perspective of how that's less useful as a human form of 
output, it's longer, wider, and has redundant punctuation that gets in 
the way.


I think that YAML quoting will need to respect one of the special cases 
to keep from ruining its readability:  Requires quoting only if the 
first character for  will make its current format look terrible if 
that rule is applied to the whole line instead.  That sounds like a 
necessary special case to include:  don't quote any quote characters 
that appear unless they're the first character on the line.  Everything 
else could switch back to really aggressive quoting in every spot and 
that wouldn't hurt the readability of the format very much IMHO.



Given the lack of any field separators other than newlines, I'm also
finding myself extremely doubtful about the claim that it survives
line-wrapping mutilations well.


All I was claiming there is that the output is dramatically less wide 
than the standard text format of the same plan, and therefore far less 
likely to get nailed by a mail client that wraps at normal line widths.  
Agreed that once wrapping does occur, it has serious problems too.


Here are the stats for this plan, leaving off the QUERY PLAN header from 
each:


TEXT:  4 vertical, 69 horizontal
YAML:  18 vertical, 36 horizontal
JSON:  25 vertical, 43 horizontal
XML[1]:  27 vertical, 60 horizontal

Quote the TEXT line with   or get a plan with one more line of 
intendation, and you're likely to get wrapped badly at the 72 character 
line limit some clients use.  Quite a bit more headroom before the YAML 
format will wrap like that; JSON is in the middle.


I now see plenty of use for YAML when exchanging plans over e-mail, and 
it's a bonus that should survive that format to be parseable on the 
other side.   JSON and XML are certainly the preferred way to feed plans 
into analysis tools. unambiguously.


[1] Might as well make this a complete example:

explain xmlns=http://www.postgresql.org/2009/explain;  +
  Query +
Plan+
  Node-TypeSort/Node-Type +
  Startup-Cost4449.30/Startup-Cost+
  Total-Cost4496.80/Total-Cost+
  Plan-Rows19000/Plan-Rows+
  Plan-Width268/Plan-Width+
  Sort-Key  +
Itemzip/Item  +
  /Sort-Key +
  Plans +
Plan+
  Node-TypeSeq Scan/Node-Type +
  Parent-RelationshipOuter/Parent-Relationship+
  Relation-Namecustomers/Relation-Name+
  Aliascustomers/Alias+
  Startup-Cost0.00/Startup-Cost   +
  Total-Cost726.00/Total-Cost +
  Plan-Rows19000/Plan-Rows+
  Plan-Width268/Plan-Width+
  Filter(customerid gt; 1000)/Filter +
/Plan   +
  /Plans+
/Plan   +
  /Query+
/explain

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing

Re: [HACKERS] Re: [COMMITTERS] pgsql: Add current WAL end (as seen by walsender, ie, GetWriteRecPtr()

Fujii Masao masao.fu...@gmail.com writes:
 This commit changed walsender so that it doesn't call set_ps_display() if
 update_process_title = off.

Yeah.  The idea is to not waste time on a useless snprintf() call when the
GUC is turned off.

 On the other hand, walreceiver doesn't check
 update_process_title. Though this check might not be required since it's
 within set_ps_display(), we should do that for the sake of consistency?

Seems reasonable to make the same minor optimization on that side.
Will commit.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-06-07 Thread Yeb Havinga


Florian Pflug wrote:

On Jun 7, 2010, at 12:45 , Andres Freund wrote:
  

On Monday 07 June 2010 12:37:13 Pierre C wrote:


On Sunday 30 May 2010 18:29:31 Greg Stark wrote:


On Sun, May 30, 2010 at 4:54 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  

I read through that thread and couldn't find much discussion of
alternative CRC implementations --- we spent all our time on arguing
about whether we needed 64-bit CRC or not.


SSE4.2 has a hardware CRC32 instruction, this might be interesting to
use...
  

Different polynom unfortunately...



Since only the WAL uses CRC, I guess the polynomial could be changed though. 
pg_upgrade for example shouldn't care.

RFC3385 compares different checksumming methods for use in iSCSI, and CRC32c (which uses the same polynomial as the SSE4.2 instruction) wins. Here's 
a link: http://www.faqs.org/rfcs/rfc3385.html
  
The linux kernel also uses it when it's availabe, see e.g. 
http://tomoyo.sourceforge.jp/cgi-bin/lxr/source/arch/x86/crypto/crc32c-intel.c


regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Working with PostgreSQL enums in C code

2010-06-07 Thread Joseph Adams

On Fri, May 28, 2010 at 9:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, May 28, 2010 at 12:07 AM, Joseph Adams
 joeyadams3.14...@gmail.com wrote:
 I learned that to return an enum value from C, one needs to return the
 OID of the right row of the pg_enum table.  I eventually managed to
 write the code below, which is mostly based on the enum_in function in
 src/backend/utils/adt/enum.c .

 PG_RETURN macros shouldn't do any nontrivial processing (see the
 existing ones for references).

 Yeah, that was my first reaction too.  If we don't already have one,
 it would be appropriate to provide a lookup enum value function
 (functionally about the same as enum_in, but designed to be called
 conveniently from C).  Then, if you needed to work from a textual
 enum label, you'd call that function and then PG_RETURN_OID.

Here is the function I wrote to look up enum values:

Oid enum_label_to_oid(const char *typname, const char *label)
{
Oid enumtypoid;
HeapTuple   tup;
Oid ret;

enumtypoid = TypenameGetTypid(typname);
Assert(OidIsValid(enumtypoid));

tup = SearchSysCache2(ENUMTYPOIDNAME,
  
ObjectIdGetDatum(enumtypoid),
  
CStringGetDatum(label));
Assert(HeapTupleIsValid(tup));

ret = HeapTupleGetOid(tup);

ReleaseSysCache(tup);

return ret;
}

If this were added to PostgreSQL proper, what source/header files
would it make sense to put it in?  enum.c/builtins.h ?

 However, for a built-in enum type, I agree with Robert's solution of
 just #define-ing fixed OIDs for the values of the type.

I don't know if the JSON datatype will be a contrib module or built-in
yet, but if it were contrib, would it still be better to use fixed
OIDs anyway?  One issue with setting this precedent is that new
contrib modules using enums wouldn't be compatible with older versions
of PostgreSQL.  One idea might be to insert extend CREATE TYPE name AS
ENUM so OIDs can be specified explicitly, but then that could lead to
OID clashes.  That would be a really messy problem for users.

By the way, just curious: why can't the internal representation of an
enum just be an INT starting from 0 by default, like in C?  That would
make a heck of a lot more sense, in my opinion.  It might also allow
users to do things like this in the future:

CREATE TYPE number AS ENUM ('one' = 1, 'two' = 2, 'three' = 3);


Joey Adams

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Functional dependencies and GROUP BY

2010-06-07 Thread Greg Stark

On Mon, Jun 7, 2010 at 7:33 PM, Peter Eisentraut pete...@gmx.net wrote:
 I have developed a patch that partially implements the functional
 dependency feature

Nice! :)

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Working with PostgreSQL enums in C code

2010-06-07 Thread Alvaro Herrera

Excerpts from Joseph Adams's message of lun jun 07 17:16:12 -0400 2010:

  However, for a built-in enum type, I agree with Robert's solution of
  just #define-ing fixed OIDs for the values of the type.
 
 I don't know if the JSON datatype will be a contrib module or built-in
 yet, but if it were contrib, would it still be better to use fixed
 OIDs anyway?

If it were contrib, fixed OIDs wouldn't cut it precisely for that
reason.  (A contrib module is not built-in for these purposes.)

 One issue with setting this precedent is that new
 contrib modules using enums wouldn't be compatible with older versions
 of PostgreSQL.  One idea might be to insert extend CREATE TYPE name AS
 ENUM so OIDs can be specified explicitly, but then that could lead to
 OID clashes.  That would be a really messy problem for users.

Yeah.  We've just defined an interface for pg_migrator-only usage,
allowing it to define the OID values of ENUMs; it wasn't considered a
good idea to expose the details to the user.

 By the way, just curious: why can't the internal representation of an
 enum just be an INT starting from 0 by default, like in C?  That would
 make a heck of a lot more sense, in my opinion.  It might also allow
 users to do things like this in the future:
 
 CREATE TYPE number AS ENUM ('one' = 1, 'two' = 2, 'three' = 3);

The problem is that the output function needs to be able to figure out
the value starting with only the datum value.  If it had only the 1
it couldn't know what enum it'd correspond to.  The other alternative
would have been to make enums 64 bits wide, carrying the enum OID in 32
bits and the value in the other 32.  This was dismissed as too wasteful.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

Kevin Grittner kevin.gritt...@wicourts.gov writes:
 ... my perspective is that it would be A Good Thing if it could
 just be turned on when needed.  If you have recurring bug that can
 be arranged, but in those cases you have other options; so I'm
 assuming you want this kept because it is primarily of forensic
 value after a non-repeatable bug has munged something?
 
Yeah, that's exactly the problem.  When you realize you need it,
it's too late.

 The best thought I've had so far
 is that if someone kept WAL files long enough the evidence might be
 in there somewhere

Hm, that is an excellent point.  The WAL trace would actually be a lot
superior in terms of being able to figure out what went wrong.  But
I don't quite see how we tell people either keep xmin or keep your
old WAL.  Also, for production sites the amount of WAL you'd have to
hang onto seems a bit daunting.  Other problems are the cost of shipping
it to a developer, and the impracticality of sanitizing private data in
it before you show it to somebody.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Working with PostgreSQL enums in C code

On Mon, Jun 7, 2010 at 5:16 PM, Joseph Adams joeyadams3.14...@gmail.com wrote:
 I don't know if the JSON datatype will be a contrib module or built-in
 yet, but if it were contrib, would it still be better to use fixed
 OIDs anyway?

Part of the point is that EXPLAIN (FORMAT JSON) should return json, so
this needs to be built-in.  Otherwise, that won't work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-07 Thread Kevin Grittner

Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 
 The best thought I've had so far is that if someone kept WAL
 files long enough the evidence might be in there somewhere
 
 Hm, that is an excellent point.  The WAL trace would actually be a
 lot superior in terms of being able to figure out what went wrong.
 But I don't quite see how we tell people either keep xmin or keep
 your old WAL.  Also, for production sites the amount of WAL you'd
 have to hang onto seems a bit daunting.
 
Any thoughts on how far back the WAL would need to go to deal with
the issues where such information has been useful?  (For example, we
always have at least two weeks worth, but I don't know if that's a
useful range or not.)
 
 Other problems are the cost of shipping it to a developer, and the
 impracticality of sanitizing private data in it before you show it
 to somebody.
 
Yeah, this wouldn't be a practical answer to the need unless
PostgreSQL shipped with a tool which could scan WAL and extract the
relevant information (probably under direction of someone from the
list or a private support organization).  Is the required
information predictable enough to make developing such a tool a
tractable problem?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-07 Thread Kevin Grittner

Tom Lane t...@sss.pgh.pa.us wrote:
 
 If we were actually going in this direction we'd want to write a
 much better WAL-text-dump tool than we have, and then in principle
 somebody could sanitize the text output before shipping it off.
 
I wouldn't think this would be practical unless there was a way to
scan the WAL files and dump only the bits related to the affected
pages, and probably with at least an option (possibly default) to
just mention the data type and length, rather than showing the
actual values -- how often are the values relevant, anyway?  (Not
rhetorical; I really don't know.)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Hm, that is an excellent point.  The WAL trace would actually be a
 lot superior in terms of being able to figure out what went wrong.
 But I don't quite see how we tell people either keep xmin or keep
 your old WAL.  Also, for production sites the amount of WAL you'd
 have to hang onto seems a bit daunting.
 
 Any thoughts on how far back the WAL would need to go to deal with
 the issues where such information has been useful?  (For example, we
 always have at least two weeks worth, but I don't know if that's a
 useful range or not.)

Well, it's a how long does it take you to notice data corruption
kind of issue.  The most recent case I can think of where xmin was
helpful was in trying to sort out a problem with an index being
inconsistent with the heap, which manifested as wrong query answers
for the user.  I don't know how long it took him to recognize and
report the problem.  (We never did locate the bug-if-any, IIRC...
it would have been much more helpful to have the WAL trace.  xmin
did let me rule out some theories, though.)
 
 Other problems are the cost of shipping it to a developer, and the
 impracticality of sanitizing private data in it before you show it
 to somebody.
 
 Yeah, this wouldn't be a practical answer to the need unless
 PostgreSQL shipped with a tool which could scan WAL and extract the
 relevant information (probably under direction of someone from the
 list or a private support organization).  Is the required
 information predictable enough to make developing such a tool a
 tractable problem?

Hard to tell.  If we were actually going in this direction we'd
want to write a much better WAL-text-dump tool than we have, and then
in principle somebody could sanitize the text output before shipping
it off.  But going through a large volume of data that way could be
pretty impractical.  Also, we (or at least I) have nearly zip experience
with trying to debug problems by examining WAL, so it's not real clear
to me which details might be important.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] How to get permission to debug postgres?

2010-06-07 Thread Randy Solomonson

I have a system that uses C methods as functions in a Postgres
database. When I try to run gdb with my user account I get this:
ptrace: Operation not permitted

It looks like a permission thing. It seemed to work when I ran gdb as
the postgres user. I would like to continue running postgres as the
postgres user and not have to su postgres when I want to debug. Can
I somehow grant my user account ptrace permission for the process in
question? Adding myself to the postgres group didn't seem to help.

I found this on the ptrace man page:

EPERM  The specified process cannot be traced.  This could  be  because
  the  parent has insufficient privileges (the required capability
  is CAP_SYS_PTRACE); non-root processes  cannot  trace  processes
  that  they  cannot  send  signals  to or those running set-user-
  ID/set-group-ID programs, for obvious  reasons.   Alternatively,
  the process may already be being traced, or be init(8) (PID 1).

So do I just need to give CAP_SYS_TRACE privileges to my user?  If so, how?

Thank you much.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] multibyte charater set in levenshtein function

2010-06-07 Thread Alexander Korotkov

Hello Hackers!

I have extended my patch by introducing levenshtein_less_equal function.
This function have additional argument max_d and stops calculating when
distance exceeds max_d. With low values of max_d function works much faster
than original one.

The example of original levenshtein function usage:

test=# select word, levenshtein(word, 'consistent') as dist from words where
levenshtein(word, 'consistent') = 2 order by dist;
word | dist
-+--
  consistent  |0
 insistent   |2
 consistency |2
 coexistent  |2
 consistence |2
(5 rows)

test=# explain analyze select word, levenshtein(word, 'consistent') as dist
from words where levenshtein(word, 'consistent') = 2 order by dist;
  QUERY PLAN

---
 Sort  (cost=2779.13..2830.38 rows=20502 width=8) (actual
time=203.652..203.658 rows=5 loops=1)
   Sort Key: (levenshtein(word, 'consistent'::text))
   Sort Method:  quicksort  Memory: 25kB
   -  Seq Scan on words  (cost=0.00..1310.83 rows=20502 width=8) (actual
time=19.019..203.601 rows=5 loops=1)
 Filter: (levenshtein(word, 'consistent'::text) = 2)
 Total runtime: 203.723 ms
(6 rows)

Example of levenshtein_less_equal usage in this case:

test=# select word, levenshtein_less_equal(word, 'consistent', 2) as dist
from words where levenshtein_less_equal(word, 'consistent', 2) = 2 order by
dist;
word | dist
-+--
 consistent  |0
 insistent   |2
 consistency |2
 coexistent  |2
 consistence |2

test=# explain analyze select word, levenshtein_less_equal(word,
'consistent', 2) as dist from words where levenshtein_less_equal(word,
'consistent', 2) = 2 order by dist;
 QUERY PLAN

-
 Sort  (cost=2779.13..2830.38 rows=20502 width=8) (actual
time=42.198..42.203 rows=5 loops=1)
   Sort Key: (levenshtein_less_equal(word, 'consistent'::text, 2))
   Sort Method:  quicksort  Memory: 25kB
   -  Seq Scan on words  (cost=0.00..1310.83 rows=20502 width=8) (actual
time=5.391..42.143 rows=5 loops=1)
 Filter: (levenshtein_less_equal(word, 'consistent'::text, 2) = 2)
 Total runtime: 42.292 ms
(6 rows)

In the example above levenshtein_less_equal works about 5 times faster.

With best regards,
Alexander Korotkov.


fuzzystrmatch-0.3.diff.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Parameters of GiST indexes

2010-06-07 Thread Alexander Korotkov

Hi hackers,

I found that some parameters of GiST implementation are builin in the code.
For example, following can be found in the backend/utils/adt/tsgistidx.c:

#define SIGLENINT  31/* 121 = key will toast, so it will not
work
 * !!! */

#define SIGLEN( sizeof(int4) * SIGLENINT )
#define SIGLENBIT (SIGLEN * BITS_PER_BYTE)

I think that such parameters don't have optimal value for all the cases; and
it would be a great option to let user define such parameters for particular
index.
For example, following syntax can be used:

CREATE INDEX name ON table USING gist(column) WITH (SIGLENINT=63);

With best regards,
Korotkov Alexander.

[HACKERS] Using multidimensional indexes in ordinal queries

2010-06-07 Thread Alexander Korotkov

Hello hackers,

I would like to share some my thoughts about usage of multidimensional
indexes for queries which deal with ordinal unidimensional data types. I
think that gist indexes (especially with knngist) can produce great benefit
for complex multi-criterion queries.
Let's consider come example. I use postgresql-9.0beta1 with knngist patch.
Also I have created simple patch that allows to use knngist for ordinal
sorting in cube extension (patch is attached). The * operator was
introduced in my patch. The first operand is the cube and the second operand
is number n. If n = 2*k then the ascending ordering by k-dimension occurs.
If n = 2*k + 1 then descending ordering by k-dimension occurs. Now this
operator have a limitation and works only with nonnegative coordinate
values.
Let's create table with 3 float-point columns and fill it with 10M rows;

create table test (id serial primary key, v1 double precision, v2 double
precision, v3 double precision);
insert into test (v1,v2,v3) (select random()*1000, random()*1000,
random()*1000 from generate_series(1,1000,1));

Now, let's create 3 separate btree indexes and one gist cube index.

create index test_v1_idx on test(v1);
create index test_v2_idx on test(v2);
create index test_v3_idx on test(v3);
create index test_cube_idx on test using gist(cube(ARRAY[v1,v2,v3]));

Let's consider some complex query with filtering, ordering and limit.

test=# select * from test where v1 between 480 and 500 and v2 between 480
and 500 order by v3 limit 10;
id|v1|v2|v3
--+--+--+---
 12283631 | 485.982828773558 | 496.795611456037 | 0.213871244341135
  4936086 | 497.239370364696 | 491.878624074161 |  1.26481195911765
  8963067 | 484.963194001466 | 497.094289399683 |  1.30057940259576
 12435440 | 498.670902103186 | 498.667187988758 |  1.33110675960779
 11667415 | 494.398592971265 | 497.440234292299 |  1.44533207640052
  8530558 |  482.85893118009 | 496.267838869244 |  1.48530444130301
  4004942 | 483.679085504264 | 489.547223784029 |  1.57393841072917
 14897796 |  491.37338064611 | 487.47524273 |  1.81775307282805
  4105759 | 489.506138022989 |  486.91446846351 |  1.94038823246956
 12895656 | 499.508572742343 | 487.065799534321 |  2.34963605180383
(10 rows)

test=# explain analyze select * from test where v1 between 480 and 500 and
v2 between 480 and 500 order by v3 limit 10;

 QUERY PLAN

---
 Limit  (cost=22786.73..22786.75 rows=10 width=28) (actual
time=3242.135..3242.162 rows=10 loops=1)
   -  Sort  (cost=22786.73..22797.59 rows=4345 width=28) (actual
time=3242.131..3242.144 rows=10 loops=1)
 Sort Key: v3
 Sort Method:  top-N heapsort  Memory: 25kB
 -  Bitmap Heap Scan on test  (cost=8755.91..22692.83 rows=4345
width=28) (actual time=1281.030..3234.934 rows=4027 loops=1)
   Recheck Cond: ((v1 = 480::double precision) AND (v1 =
500::double precision) AND (v2 = 480::double precision) AND (v2 =
500::double precision))
   -  BitmapAnd  (cost=8755.91..8755.91 rows=4345 width=0)
(actual time=1280.783..1280.783 rows=0 loops=1)
 -  Bitmap Index Scan on test_v1_idx
 (cost=0.00..4243.12 rows=202177 width=0) (actual time=644.702..644.702
rows=200715 loops=1)
   Index Cond: ((v1 = 480::double precision) AND
(v1 = 500::double precision))
 -  Bitmap Index Scan on test_v2_idx
 (cost=0.00..4510.37 rows=214902 width=0) (actual time=630.085..630.085
rows=200200 loops=1)
   Index Cond: ((v2 = 480::double precision) AND
(v2 = 500::double precision))
 Total runtime: 3242.253 ms
(12 rows)

This query can be rewritten in order to let planner use gist cube index.

test=# select * from test where cube(array[v1,v2,v3]) @
cube(array[480,480,'-Infinity'::float],array[500,500,'+Infinity'::float])
order by cube(array[v1,v2,v3]) * 4 limit 10;
id|v1|v2|v3
--+--+--+---
 12283631 | 485.982828773558 | 496.795611456037 | 0.213871244341135
  4936086 | 497.239370364696 | 491.878624074161 |  1.26481195911765
  8963067 | 484.963194001466 | 497.094289399683 |  1.30057940259576
 12435440 | 498.670902103186 | 498.667187988758 |  1.33110675960779
 11667415 | 494.398592971265 | 497.440234292299 |  1.44533207640052
  8530558 |  482.85893118009 | 496.267838869244 |  1.48530444130301
  4004942 | 483.679085504264 | 489.547223784029 |  1.57393841072917
 14897796 |  491.37338064611 | 487.47524273 |  1.81775307282805
  4105759 | 489.506138022989 |  486.91446846351 |  1.94038823246956
 12895656 | 499.508572742343 | 487.065799534321 |  2.34963605180383
(10 rows)

test=# explain analyze select * from test where

Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-07 Thread Greg Smith


Tom Lane wrote:

If we were actually going in this direction we'd
want to write a much better WAL-text-dump tool than we have, and then
in principle somebody could sanitize the text output before shipping
it off.  But going through a large volume of data that way could be
pretty impractical.  Also, we (or at least I) have nearly zip experience
with trying to debug problems by examining WAL, so it's not real clear
to me which details might be important.
  


There's another interesting thing about moving in this direction too.  
Systems that have WAL archiving setup that run into a problem could end 
up being a much richer source for historical analysis of how the system 
got into the bad state than is available right now.  Typically those can 
have longer histories available than you'll find on a primary that's 
recycling segments all the time.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

(2010/06/07 20:53), Heikki Linnakangas wrote:
 On 07/06/10 14:06, Stephen Frost wrote:
 * Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
 The big difference is what information can be obtained, not how fast it
 can be obtained.

 Actually, I disagree. Time required to acquire the data does matter.
 
 Depends on the magnitude, of course. If it takes 1 year per row, that's 
 probably acceptable. If it takes 1 second, that's extremely slow 
 compared to normal queries, but most likely still disastreous from a 
 security point of view.
 
FYI, the classic also mentioned about bandwidth of covert channel,
although it was already obsoleted.

See the page.80 of:
  http://csrc.nist.gov/publications/history/dod85.pdf

It said 1bit/sec are acceptable on DoD in 25 years ago.

 Imagine a table that holds username/passwords for users. Each user is
 allowed to see his own row, including password, but not anyone else's.
 EXPLAIN side-channel might give pretty accurate information of how many
 rows there is in the table, and via clever EXPLAIN+statistics probing
 you might be able to find out what the top-10 passwords are, for
 example. But if you wanted to know what your neighbor's password is, the
 side-channels would not help you much, but an error message would reveal
 it easily.

 Using only built-ins, could you elaborate on how one could pick exactly
 what row was revealed using an error case? That strikes me as
 difficult, but perhaps I'm not thinking creatively enough.
 
 WHERE should do it:
 
 SELECT * FROM secrets_view WHERE username = 'neighbor' AND 
 password::integer = 1234;
 ERROR: invalid input syntax for integer: neighborssecretpassword
 
 Assuming that username = 'neighbor' is evaluated before the cast.
 

In this case, is it unnecessary to expose the given argument in
the error message (from security perspective), isn't it?
Because it is basically matter of the integer input handler,
it seems to me what we should fix up is int4in(), not optimizer.

Perhaps, we should categorize the issued functionalities base on
the level of its threat when abused.

* High-threat
Functions have side-effect that allows to move the given arguments
into another tables or other high-bandwidth chennel.
E.g) lowrite(), pg_write_file()

 - It should be fixed soon.

* Middle-threat
Functions have side-effect that allows to move the given arguments
using error messages or other low-bandwidth channel.
E.g) int4in()

 - It should be fixed in long term.

* Row-threat
Functions can imply existence of invisible tuples, but it does not
expose the value itself.
E.g) EXPLAIN statement, PK/FK constraints

 - It should not be fixed in PostgreSQL.

Now we allow all of them.
But isn't it valuable to fix the high-threat first?
Then, we can revise error messages in built-in functions, I think.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

(2010/06/07 21:56), Stephen Frost wrote:
 * Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
 WHERE should do it:

 SELECT * FROM secrets_view WHERE username = 'neighbor' AND
 password::integer = 1234;
 ERROR:  invalid input syntax for integer: neighborssecretpassword

 Assuming that username = 'neighbor' is evaluated before the cast.
 
 Fair enough, so we can't allow built-ins either, except perhaps in very
 specific/limited situations.  Still, if we track that the above WHERE
 and password::integer calls *should* be run as role X, while the view
 should run as role Y, maybe we can at least identify the case where
 we've ended up in a situation where we are going to expose unintended
 data.  I don't know enough about the optimizer or the planner to have
 any clue how we might teach them to actually avoid doing such, though I
 certainly believe it could end up being a disaster on performance based
 on comments from others who know better. :)
 

My opinion is that it is a matter in individual functions, not optimizer.
Basically, built-in functions *should* be trusted, because our security
mechanism is not designed to prevent anything from malicious internal
binary modules.

Historically, we have not known the risk to leak invisible information
using error messages for a long time, so most of internal functions have
not been designed not to return users unnecessary information.
If so, it needs to revise error messages, but it will not complete with
a single commit.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgstatindex still throws ERROR: value 3220078592 is out of range for type integer

2010-06-07 Thread Takahiro Itagaki


Dave Cramer p...@fastcrypt.com wrote:

 I noted on line 169 that max_avail is still an int ? Where else would
 it be having problems ?

It should not a problem because the local variable only stores byte
size in a page. It will be at most only BLCKSZ (=8192).

I wonder why you had ERROR: value ... is out of range for type integer
message because we don't use any integer data types for sizes in
pgstatindex. The error should have been thrown by SQL typin functions
rather than C routines.

CREATE OR REPLACE FUNCTION pgstatindex(IN relname text,
OUT version INT,
OUT tree_level INT,
OUT index_size BIGINT,
OUT root_block_no BIGINT,
OUT internal_pages BIGINT,
OUT leaf_pages BIGINT,
OUT empty_pages BIGINT,
OUT deleted_pages BIGINT,
OUT avg_leaf_density FLOAT8,
OUT leaf_fragmentation FLOAT8)
AS 'MODULE_PATHNAME', 'pgstatindex'
LANGUAGE C STRICT;

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Working with PostgreSQL enums in C code

2010-06-07 Thread Andrew Dunstan




Joseph Adams wrote:


By the way, just curious: why can't the internal representation of an
enum just be an INT starting from 0 by default, like in C?  That would
make a heck of a lot more sense, in my opinion.  It might also allow
users to do things like this in the future:


  


Please review the debates over the internal representation from several 
years ago when enums were implemented. Essentially the difficulty is 
that the output function needs to get nothing more than the value 
itself, and that means the representation needs to carry with it some 
information about *which* enum set it is in.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

KaiGai Kohei kai...@ak.jp.nec.com writes:
 In this case, is it unnecessary to expose the given argument in
 the error message (from security perspective), isn't it?

Yes, if all you care about is security and not usability, that looks
like a great solution.  We're *not* doing it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

(2010/06/08 9:23), KaiGai Kohei wrote:
 (2010/06/07 21:56), Stephen Frost wrote:
 * Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
 WHERE should do it:

 SELECT * FROM secrets_view WHERE username = 'neighbor' AND
 password::integer = 1234;
 ERROR:  invalid input syntax for integer: neighborssecretpassword

 Assuming that username = 'neighbor' is evaluated before the cast.

 Fair enough, so we can't allow built-ins either, except perhaps in very
 specific/limited situations.  Still, if we track that the above WHERE
 and password::integer calls *should* be run as role X, while the view
 should run as role Y, maybe we can at least identify the case where
 we've ended up in a situation where we are going to expose unintended
 data.  I don't know enough about the optimizer or the planner to have
 any clue how we might teach them to actually avoid doing such, though I
 certainly believe it could end up being a disaster on performance based
 on comments from others who know better. :)

 
 My opinion is that it is a matter in individual functions, not optimizer.
 Basically, built-in functions *should* be trusted, because our security
 mechanism is not designed to prevent anything from malicious internal
 binary modules.
 
Sorry, it does not mean *all* the built-in functions could be trusted.
Some of built-in ones cannot be trusted from definitions, such as
lowrite().

Perhaps, it eventually needs a flag in the pg_proc to mark a function
being either trusted or untrusted. Then, planner may be able to check
the flag to decide whether is can be pushed down, or not.

If so, we can mark int4in() as trusted, when we revise the issue of
error message. I think the idea makes this problem more simple.

Thanks,

 Historically, we have not known the risk to leak invisible information
 using error messages for a long time, so most of internal functions have
 not been designed not to return users unnecessary information.
 If so, it needs to revise error messages, but it will not complete with
 a single commit.
 
 Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Functional dependencies and GROUP BY

2010-06-07 Thread Hitoshi Harada

2010/6/8 Peter Eisentraut pete...@gmx.net:
 I have developed a patch that partially implements the functional
 dependency feature that allows some columns to be omitted from the
 GROUP BY clause if it can be shown that the columns are functionally
 dependent on the columns in the group by clause and therefore guaranteed
 to be unique per group.  The full functional dependency deduction rules
 are pretty big and arcane, so I concentrated on getting a useful subset
 working.  In particular:

 Also, when a column is compared with a constant, it can appear
 ungrouped:

 SELECT x, y FROM tab2 WHERE y = 5 GROUP BY x;

I don't see why it should be allowed. I see the insist that y must be
unique value so it is ok to be ungrouped but the point of discussion
is far from that; Semantically y is not grouping key.

In addition, what if y is implicitly a constant? For example,

SELECT x, y FROM tab2 WHERE y = a AND a = 5 GROUP BY x;

or there should be more complicated example including JOIN cases. I
don't believe we can detect all of such cases. If the simple case is
allowed, users don't understand why the complicated case doesn't allow
sometimes. So it'll not be consistent.

Finally, it may hide unintended bugs. ORM tools may make WHERE clause
in some conditions and don't in other conditions.

Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] How to get permission to debug postgres?

On Sat, Jun 5, 2010 at 8:16 AM, Randy Solomonson ra...@solomonson.com wrote:
I have a system that uses C methods as functions in a Postgres
database. When I try to run gdb with my user account I get this:
ptrace: Operation not permitted

It looks like a permission thing. It seemed to work when I ran gdb as
the postgres user. I would like to continue running postgres as the
postgres user and not have to su postgres when I want to debug. Can
I somehow grant my user account ptrace permission for the process in
question? Adding myself to the postgres group didn't seem to help.

I found this on the ptrace man page:

EPERM The specified process cannot be traced. This could be because
the parent has insufficient privileges (the required capability
is CAP_SYS_PTRACE); non-root processes cannot trace processes
that they cannot send signals to or those running set-user-
ID/set-group-ID programs, for obvious reasons. Alternatively,
the process may already be being traced, or be init(8) (PID 1).

So do I just need to give CAP_SYS_TRACE privileges to my user? If so, how?

This isn't really a PostgreSQL-specific question but more of a general
Linux admin question, and I confess I don't really know the answer. I
*think* that capabilities are more intended to be set on particular
executables, e.g. give ping the ability to open raw sockets without
making it setuid root, rather than to particular users, but I'm not
really sure.

In terms of PG development, I've found that it's much simpler to start
up a test instance of PG under my own user account and then do my
debugging from there.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Greg Stark

On Tue, Jun 8, 2010 at 1:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 KaiGai Kohei kai...@ak.jp.nec.com writes:
 In this case, is it unnecessary to expose the given argument in
 the error message (from security perspective), isn't it?

 Yes, if all you care about is security and not usability, that looks
 like a great solution.  We're *not* doing it.

It's possible to take a more nuanced angle on this approach. You could
imagine a flag indicating whether the call is within a SECURE VIEW
which if enabled caused error messages to elide data taken from
arguments. In order for this not to destroy the code legibility I
think you would need to design some new %-escapes for error messages
to indicate such arguments or some similar trick. You wouldn't want to
have to put extra conditionals everywhere. And I'm not sure where to
conveniently put such a flag so it would have the right semantics.

It would still be a lot of work and a big patch, but mostly mechanical
and it wouldn't impact usability for any case where it wasn't
necessary. It would still have the problem described earlier that we
would keep finding new omissions for years. I can't see us
implementing variable taint tracking in C to do it automatically
though. Perhaps we could get it from one of the static analysis tools.


-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

(2010/06/08 9:46), Tom Lane wrote:
 KaiGai Koheikai...@ak.jp.nec.com  writes:
 In this case, is it unnecessary to expose the given argument in
 the error message (from security perspective), isn't it?
 
 Yes, if all you care about is security and not usability, that looks
 like a great solution.  We're *not* doing it.
 
Sorry, are you saying we should not revise error messages because
of usability??

If so, and if we decide the middle-threat also should be fixed,
it is necessary to distinguish functions trusted and untrusted,
even if a function is built-in.
Perhaps, pg_proc takes a new flag to represent it.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

KaiGai,

* KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
 Perhaps, pg_proc takes a new flag to represent it.

Without an actual well-formed approach for dealing with this which
requires it, it's far too soon to be asking for changes in the catalog.
Please stop suggesting individual maybe-this-would-help changes.  We
need an actual solution which addresses all, or at least most, of the
concerns described, followed by a patch which implements the mechanics
of the solution.  If the solution calls for an additional pg_proc field,
then the patch should include it.

Not sure if this would translate well, but asking for new flags to be
added to pg_proc right now is putting the cart before the horse.  We
don't even know which functions we might mark as trusted or not yet, nor
is it even clear that adding such a flag would actually help.  Adding a
flag to pg_proc is certainly nothing like a solution to this problem by
itself.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

* Tom Lane (t...@sss.pgh.pa.us) wrote:
 If we were actually going in this direction we'd
 want to write a much better WAL-text-dump tool than we have, and then
 in principle somebody could sanitize the text output before shipping
 it off.  But going through a large volume of data that way could be
 pretty impractical.  Also, we (or at least I) have nearly zip experience
 with trying to debug problems by examining WAL, so it's not real clear
 to me which details might be important.

Just an off-the-wall thought, but, would it be possible to have a tool
which read WAL backwards and compared entries in the WAL against entries
on disk?  I realize that you'd only see one version of a particular
block and then have to skip any updates which are earlier than it, but
it seems like you could cover a pretty large chunk of the recent changes
to the database using this approach..

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] Functional dependencies and GROUP BY

* Hitoshi Harada (umi.tan...@gmail.com) wrote:
 I don't see why it should be allowed. I see the insist that y must be
 unique value so it is ok to be ungrouped but the point of discussion
 is far from that; Semantically y is not grouping key.

Ignoring the fact that it's terribly useful- isn't it part of the SQL
spec?

 In addition, what if y is implicitly a constant? For example,
 
 SELECT x, y FROM tab2 WHERE y = a AND a = 5 GROUP BY x;

Not sure I see the issue here?

 Finally, it may hide unintended bugs. ORM tools may make WHERE clause
 in some conditions and don't in other conditions.

Yeah, this one I really just done buy..  If an ORM tool doesn't write
correct SQL, then it's the ORM's fault, not ours.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] Functional dependencies and GROUP BY

* Peter Eisentraut (pete...@gmx.net) wrote:
 This is frequently requested by MySQL converts (and possibly others).

I'd certainly love to see it- but let's not confuse people by implying
that it would actually act the way MySQL does.  It wouldn't, because
what MySQL does is alot closer to 'distinct on' and is patently insane
to boot.  Again, I'd *love* to see this be done in PG, but when we
document it and tell people about it, *please* don't say it's similar in
any way to MySQL's oh, we'll just pick a random value from the columns
that aren't group'd on implementation.

 At this point, this patch could use some review and testing with unusual
 queries that break my implementation. ;-)

I'll give it a shot... :)

Thanks!

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

(2010/06/08 10:17), Stephen Frost wrote:
 KaiGai,
 
 * KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
 Perhaps, pg_proc takes a new flag to represent it.
 
 Without an actual well-formed approach for dealing with this which
 requires it, it's far too soon to be asking for changes in the catalog.
 Please stop suggesting individual maybe-this-would-help changes.  We
 need an actual solution which addresses all, or at least most, of the
 concerns described, followed by a patch which implements the mechanics
 of the solution.  If the solution calls for an additional pg_proc field,
 then the patch should include it.
 
OK, it was too implementation-specific.

Please return to the categorization with 3-level that I mentioned at
the previous message.

I guess nobody opposes to disable pushing down on functions categorized
to high-threat, such as lowrite() and so on.
It actually can move given arguments to other tables, files, ...

And, I guess nobody wants to tackle an issue categorized to low-threat,
such as EXPLAIN, PK/FK constraints and so on.
It can imply existence of invisible objects, but no leaks of actual value.


Our headache is on functions categorized to middle-threat. It enables to
leak the given arguments using error messages. Here are several ideas,
but they have good and bad points.

At first, it is necessary whether we should fix up the threat in this
level, or not. It seems to me majority of the -hackers want to fix both
of high and middle level.

If we should fix up the issue, I think we have only two options semantically.

(A) It prevents leaky functions to be pushed down, so no invisible
information will be provided. But it makes performance regression.

(B) It prevents leaky functions to raise an error, although we allow
it to be pushed down. But is needs large scale of code changes.

Of course, it has trade-off. As TCSEC mentioned, we are facing with the
large potential cost of reducing the bandwidths of such covert channel

 Not sure if this would translate well, but asking for new flags to be
 added to pg_proc right now is putting the cart before the horse.  We
 don't even know which functions we might mark as trusted or not yet, nor
 is it even clear that adding such a flag would actually help.  Adding a
 flag to pg_proc is certainly nothing like a solution to this problem by
 itself.

For built-in functions, the code should be reviewed to ensure it does not
expose the given argument using error messages.
Then, we can mark it as trusted.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010/6/7 KaiGai Kohei kai...@ak.jp.nec.com:
 Our headache is on functions categorized to middle-threat. It enables to
 leak the given arguments using error messages. Here are several ideas,
 but they have good and bad points.

I think we are altogether off in the weeds here.  We ought to start
with an implementation that pushes nothing down, and then try to
figure out how much we can relax that without too much compromising
security.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

* Robert Haas (robertmh...@gmail.com) wrote:
 2010/6/7 KaiGai Kohei kai...@ak.jp.nec.com:
  Our headache is on functions categorized to middle-threat. It enables to
  leak the given arguments using error messages. Here are several ideas,
  but they have good and bad points.
 
 I think we are altogether off in the weeds here.  We ought to start
 with an implementation that pushes nothing down, and then try to
 figure out how much we can relax that without too much compromising
 security.

I agree with this- and it's more-or-less what I was trying to propose in
my previous comments.  I'm not even sure we need to focus on not pushing
anything down at this point- I'd start with trying to get enough
information passed around/through the system to even *identify* the case
where there's a problem in the first place..

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

(2010/06/08 11:15), Robert Haas wrote:
 2010/6/7 KaiGai Koheikai...@ak.jp.nec.com:
 Our headache is on functions categorized to middle-threat. It enables to
 leak the given arguments using error messages. Here are several ideas,
 but they have good and bad points.
 
 I think we are altogether off in the weeds here.  We ought to start
 with an implementation that pushes nothing down, and then try to
 figure out how much we can relax that without too much compromising
 security.
 
It seems to me fair enough.
I think we can adjust what functions are harmless, and whats are not later.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

For the sake of clarity..

* KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
 OK, it was too implementation-specific.

No, that wasn't the problem.  There isn't an actual implementation yet
for it to be too-specific on.  The problem is that proposing a change to
the catalog without figuring out what it'd actually be used for in an
overall solution is a waste of time. 

 Please return to the categorization with 3-level that I mentioned at
 the previous message.

As Robert said, we're off in the weeds here.  I'm not convinced that
we've got 3 levels, for starters.  It could well be fewer, or more.
Let's stop making assumptions about what's OK and what's not OK.

 For built-in functions, the code should be reviewed to ensure it does not
 expose the given argument using error messages.
 Then, we can mark it as trusted.

One thing that I think *is* clear- removing useful information from
error messages is *not* going to be an acceptable solution.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

(2010/06/08 11:28), Stephen Frost wrote:
 For the sake of clarity..
 
 * KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
 OK, it was too implementation-specific.
 
 No, that wasn't the problem.  There isn't an actual implementation yet
 for it to be too-specific on.  The problem is that proposing a change to
 the catalog without figuring out what it'd actually be used for in an
 overall solution is a waste of time.
 
Indeed,

 Please return to the categorization with 3-level that I mentioned at
 the previous message.
 
 As Robert said, we're off in the weeds here.  I'm not convinced that
 we've got 3 levels, for starters.  It could well be fewer, or more.
 Let's stop making assumptions about what's OK and what's not OK.
 
Indeed, we may find out the 4th category in the future.

 For built-in functions, the code should be reviewed to ensure it does not
 expose the given argument using error messages.
 Then, we can mark it as trusted.
 
 One thing that I think *is* clear- removing useful information from
 error messages is *not* going to be an acceptable solution.
 
Even if it is conditional, like as Greg Stark suggested?

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SR slaves and .pgpass

On Tue, Jun 8, 2010 at 12:42 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 On Mon, Jun 7, 2010 at 5:42 AM, Andrew Dunstan and...@dunslane.net wrote:
 I tried this with a database name of replication in the .pgpass file,
 which matches what we need to use in pg_hba.conf, but it failed miserably,
 and only worked when I used a wildcard for the database name in the .pgpass
 file.

 If this is expected it needs to be documented more clearly; if not, it's a
 bug.

 Yep, this is expected, so we need to improve the doc.

 Why don't we improve the code, instead?  In particular make
 libpqrcv_connect() do

 -       snprintf(conninfo_repl, sizeof(conninfo_repl), %s replication=true, 
 conninfo);
 +       snprintf(conninfo_repl, sizeof(conninfo_repl), %s 
 database=replication replication=true, conninfo);

What if the real database named replication exists? How can we
specify the password only for replication purpose in that case?

BTW, to distinguish the replication connection from the connection
to the real database named replication, I proposed changing the
.pgpass code so that it accepts the keyword only for replication,
like pg_hba.conf. But it was rejected, and as the result of the
discussion, we had consensus to not change the code.
http://archives.postgresql.org/pgsql-hackers/2010-01/msg00400.php

 I don't think it's unlikely that someone would try to enter a
 replication-specific password into ~/.pgpass.

Agreed.

But I think that we don't need to specify other than the wildcard
in the database field of .pgpass to use the replication-specific
password if the replication-specific user is supplied in .pgpass.
So the current code is enough for me. Am I missing something?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SR slaves and .pgpass

Fujii Masao masao.fu...@gmail.com writes:
 But I think that we don't need to specify other than the wildcard
 in the database field of .pgpass to use the replication-specific
 password if the replication-specific user is supplied in .pgpass.
 So the current code is enough for me. Am I missing something?

You're looking at it from the perspective of somebody who knows
exactly how the code works.  What Andrew tried is exactly what
95% of other people would try.  There doesn't seem to me to be
any very good argument against making it work for them.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Functional dependencies and GROUP BY

Peter Eisentraut pete...@gmx.net writes:
 I have developed a patch that partially implements the functional
 dependency feature that allows some columns to be omitted from the
 GROUP BY clause if it can be shown that the columns are functionally
 dependent on the columns in the group by clause and therefore guaranteed
 to be unique per group.

The main objection to this is the same one I've had all along: it makes
the syntactic validity of a query dependent on what indexes exist for
the table.  At minimum, that means that enforcing the check at parse
time is the Wrong Thing.

The var-compared-with-constant case seems like a big crock.  Are we
really required to provide such a thing per spec?

I'm also fairly concerned about the performance of a check implemented
this way --- it's going to do a lot of work, and do it over and over
again as it traverses the query tree.  At least some of that could be
alleviated after you move the check to the planner, just by virtue of
the index information already having been acquired ... but I'd still
suggest expending more than no effort on caching the results.  For
instance, given SELECT * FROM a_very_wide_table GROUP BY pk you
shouldn't have to prove more than once that a_very_wide_table is
grouped by its PK.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SR slaves and .pgpass

On Tue, Jun 8, 2010 at 12:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 But I think that we don't need to specify other than the wildcard
 in the database field of .pgpass to use the replication-specific
 password if the replication-specific user is supplied in .pgpass.
 So the current code is enough for me. Am I missing something?

 You're looking at it from the perspective of somebody who knows
 exactly how the code works.  What Andrew tried is exactly what
 95% of other people would try.  There doesn't seem to me to be
 any very good argument against making it work for them.

Hmm.. is it worth going back to my proposal?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SR slaves and .pgpass