Re: [HACKERS] [WIP PATCH] for Performance Improvement in Buffer Management

2012-10-22 Thread Amit Kapila
On Saturday, October 20, 2012 11:07 PM  Jeff Janes wrote:
 On Fri, Oct 19, 2012 at 11:00 PM, Amit kapila amit.kap...@huawei.com
 wrote:
 
  Robert wrote an accounting patch a while ago that tallied how often a
  buffer was cleaned but then reclaimed for the same page before being
  evicted.  But now I can't find it.  If you can find that thread,
 there
  might be some benchmarks posted to it that would be useful.
 
  In my first level search, I am also not able to find it. But now I am
 planning to check all
  mails of Robert Haas on PostgreSQL site (which are approximately
 13,000).
  If you can tell me how long ago approximately (last year, 2 yrs back,
 ..) or whether such a patch is submitted
  to any CF or was just discussed in mail chain, then it will be little
 easier for me.
 
 It was just an instrumentation patch for doing experiments, not
 intended for commit.
 
 I've tracked it down to the thread Initial 9.2 pgbench write
 results.  But I don't think it applies to the -S benchmark, because
 it records when the background writer cleaned a buffer by finding it
 dirty and writing it out to make it clean, while in this situation we
 would need something more like either made the buffer clean and
 reusable, observed the buffer to already be clean and reusable

Do you think an instrumentation patch which can give us how many times a
buffer is found by Clock Sweep and how many times it's found from freelist
will be useful?
I have written something on similar lines when I was testing this patch to
find out how many times this patch can avoid clock sweep.
My observation was that although the new implementation saves many cycles of
clock sweep, but still with shared buffers upto 2,2.5G there is no visible
performance gain.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump --split patch

2012-10-22 Thread Marko Tiikkaja

Hi,

Now that the (at least as far as I know) last ordering problem in 
pg_dump has been solved [1], I'm going to attempt resurrecting this old 
thread.


It seemed to me that the biggest objections to this patch in the old 
discussions were directed at the implementation, which I have tried to 
improve.  The attached patch implements the actual splitting in a new 
backup format.


The general output scheme looks like this:
   schemaname/OBJECT_TYPES/object_name.sql,

but there are some exceptions.

Overloaded functions are dumped into the same file.  Object names are 
encoded into the POSIX Portable Filename Character Set ([a-z0-9._-]) by 
replacing any characters outside that set with an underscore.


Restoring the dump is supported through an index.sql file containing 
statements which include (through \i) the actual object files in the 
dump directory.


Any thoughts?  Objections on the idea or the implementation?


[1]: 
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7b583b20b1c95acb621c71251150beef958bb603



Regards,
Marko Tiikkaja
*** a/src/bin/pg_dump/Makefile
--- b/src/bin/pg_dump/Makefile
***
*** 19,25  include $(top_builddir)/src/Makefile.global
  override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
  
  OBJS= pg_backup_archiver.o pg_backup_db.o pg_backup_custom.o \
!   pg_backup_null.o pg_backup_tar.o \
pg_backup_directory.o dumpmem.o dumputils.o compress_io.o $(WIN32RES)
  
  KEYWRDOBJS = keywords.o kwlookup.o
--- 19,25 
  override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
  
  OBJS= pg_backup_archiver.o pg_backup_db.o pg_backup_custom.o \
!   pg_backup_null.o pg_backup_tar.o pg_backup_split.o \
pg_backup_directory.o dumpmem.o dumputils.o compress_io.o $(WIN32RES)
  
  KEYWRDOBJS = keywords.o kwlookup.o
*** a/src/bin/pg_dump/pg_backup.h
--- b/src/bin/pg_dump/pg_backup.h
***
*** 50,56  typedef enum _archiveFormat
archCustom = 1,
archTar = 3,
archNull = 4,
!   archDirectory = 5
  } ArchiveFormat;
  
  typedef enum _archiveMode
--- 50,57 
archCustom = 1,
archTar = 3,
archNull = 4,
!   archDirectory = 5,
!   archSplit = 6
  } ArchiveFormat;
  
  typedef enum _archiveMode
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
***
*** 2125,2130  _allocAH(const char *FileSpec, const ArchiveFormat fmt,
--- 2125,2134 
case archTar:
InitArchiveFmt_Tar(AH);
break;
+   
+   case archSplit:
+   InitArchiveFmt_Split(AH);
+   break;
  
default:
exit_horribly(modulename, unrecognized file format 
\%d\\n, fmt);
*** a/src/bin/pg_dump/pg_backup_archiver.h
--- b/src/bin/pg_dump/pg_backup_archiver.h
***
*** 369,374  extern void InitArchiveFmt_Custom(ArchiveHandle *AH);
--- 369,375 
  extern void InitArchiveFmt_Null(ArchiveHandle *AH);
  extern void InitArchiveFmt_Directory(ArchiveHandle *AH);
  extern void InitArchiveFmt_Tar(ArchiveHandle *AH);
+ extern void InitArchiveFmt_Split(ArchiveHandle *AH);
  
  extern bool isValidTarHeader(char *header);
  
*** /dev/null
--- b/src/bin/pg_dump/pg_backup_split.c
***
*** 0 
--- 1,1063 
+ /*-
+  *
+  * pg_backup_split.c
+  *
+  *  A split format dump is a directory, which contains all database objects
+  *  separated into .sql files, and an index.sql file with psql statements
+  *  to allow restoring the separated objects.
+  *
+  *-
+  */
+ 
+ #include postgres_fe.h
+ #include libpq-fe.h
+ #include libpq/libpq-fs.h
+ #include pg_backup_archiver.h
+ #include dumpmem.h
+ #include dumputils.h
+ 
+ #include dirent.h
+ #include sys/stat.h
+ 
+ typedef struct
+ {
+   char   *filename;   /* filename excluding the directory 
(basename) */
+   DumpId  dumpId; /* dump id of the TocEntry */
+ } lclTocEntry;
+ 
+ typedef struct
+ {
+   /*
+* Our archive location. This is basically what the user specified as 
his
+* backup file but of course here it is a directory.
+*/
+   char   *directory;
+ 
+   FILE   *dataFH; /* currently open data file */
+ 
+   lclTocEntry **sortedToc;/* array of toc entires sorted by 
(filename, dumpId) */
+ } lclContext;
+ 
+ /* translator: this is a module name */
+ static const char *modulename = gettext_noop(split archiver);
+ 
+ 
+ /* prototypes for private functions */
+ static void _ArchiveEntry(ArchiveHandle *AH, TocEntry *te);
+ static void _StartData(ArchiveHandle *AH, TocEntry *te);
+ static void _EndData(ArchiveHandle *AH, TocEntry *te);
+ static size_t _WriteData(ArchiveHandle *AH, const void *data, 

Re: [HACKERS] Deprecating RULES

2012-10-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 The problems with MERGE are mostly around concurrency, as far as I can
 tell.  I can't see why RULEs would have anything to do with it -
 except that I don't see how MERGE can sanely support rules, and even
 if we find a way to make it do that, anyone already using RULEs will
 need to adjust them to support MERGE.  I'm not sure I have a horribly
 well-thought-out position on the underlying issue here - I'm kind of
 vacillating back and forth - but I do think one of the problems with
 RULEs is that they are too tied to particular command names.  Adding
 any new commands that can select or modify data - be it MERGE, UPSERT,
 or whatever - is going to cause trouble both for implementors and for
 people relying on the feature.

And triggers (or anything else) would be better on that score because ...?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] ToDo: KNN Search should to support DISTINCT clasuse?

2012-10-22 Thread Pavel Stehule
Hello

I should to search distinct values based on similarity

postgres=# explain select  nazobce, nazobce - 'Benešov' from obce
order by nazobce - 'Benešov' limit 10
;
QUERY PLAN
---
 Limit  (cost=0.00..0.86 rows=10 width=10)
   -  Index Scan using obce_nazobce_idx on obce  (cost=0.00..1433.14
rows=16644 width=10)
 Order By: (nazobce - 'Benešov'::text)
(3 rows)

Time: 0.576 ms

but using DISTINCT breaks KNN searching optimization

postgres=# explain select distinct nazobce, nazobce - 'Benešov' from
obce order by nazobce - 'Benešov' limit 10
;
 QUERY PLAN
-
 Limit  (cost=600.45..600.47 rows=10 width=10)
   -  Sort  (cost=600.45..613.80 rows=5341 width=10)
 Sort Key: ((nazobce - 'Benešov'::text))
 -  HashAggregate  (cost=418.27..485.03 rows=5341 width=10)
   -  Seq Scan on obce  (cost=0.00..335.05 rows=16644 width=10)
(5 rows)

Regards

Pavel Stehule


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] Row-Level Security

2012-10-22 Thread Robert Haas
On Thu, Oct 18, 2012 at 2:19 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Kohei KaiGai escribió:
 The revised patch fixes the problem that Daen pointed out.

 Robert, would you be able to give this latest version of the patch a
 look?

Yeah, sorry I've been completely sidelined this CommitFest. It's been
a crazy couple of months.  Prognosis for future craziness reduction
uncertain.  Comments:

The documentation lists several documented limitations that I would
like to analyze a little bit.  First, it says that row-level security
policies are not applied on UPDATE or DELETE.  That sounds downright
dangerous to me.  Is there some really compelling reason we're not
doing it?  Second, it says that row-level security policies are not
currently applied on INSERT, so you should use a trigger, but implies
that this will change in the future.  I don't think we should change
that in the future; I think relying on triggers for that case is just
fine.  Note that it could be an issue with the post-image for UPDATES,
as well, and I think the trigger solution is similarly adequate to
cover that case.  With respect to the documented limitation regarding
DECLARE/FETCH, what exactly will happen?  Can we describe this a bit
more clearly rather than just saying the behavior will be
unpredictable?

It looks suspiciously as if the row-level security mode needs to be
saved and restored in all the same places we call save and restore the
user ID and security context.  Is there some reason the
row-level-security-enabled flag shouldn't just become another bit in
the security context?  Then we'd get all of this save/restore logic
mostly for free.

ATExecSetRowLevelSecurity() calls SetRowLevelSecurity() or
ResetRowLevelSecurity() to update pg_rowlevelsec, but does the
pg_class update itself.  I think that all of this logic should be
moved into  a single function, or at least functions in the same file,
with the one that only updates pg_rowlevelsec being static and
therefore not able to be called from outside the file.  We always need
the pg_class update and the pg_rowlevelsec update to happen together,
so it's not good to have an exposed function that does one of those
updates but not the other.  I think the simplest thing is just to move
ATExecSetRowLevelSecurity to pg_rowlevelsec.c and rename it to
SetRowLevelSecurity() and then give it two static helper functions,
say InsertPolicyRow() and DeletePolicyRow().

I think it would be good if Tom could review the query-rewriting parts
of this (viz rowlevelsec.c) as I am not terribly familiar with this
machinery, and of course anything we get wrong here will have security
consequences.  At first blush, I'm somewhat concerned about the fact
that we're trying to do this after query rewriting; that seems like it
could break things.  I know KaiGai mentioned upthread that the
rewriter won't be rerun if the plan is invalidated, but (1) why is
that OK now? and (2) if it is OK now, then why is it OK to do
rewriting of the RLS qual - only - after rewriting if all of the rest
of the rewriting needs to happen earlier?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-22 Thread Robert Haas
On Mon, Oct 22, 2012 at 8:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 The problems with MERGE are mostly around concurrency, as far as I can
 tell.  I can't see why RULEs would have anything to do with it -
 except that I don't see how MERGE can sanely support rules, and even
 if we find a way to make it do that, anyone already using RULEs will
 need to adjust them to support MERGE.  I'm not sure I have a horribly
 well-thought-out position on the underlying issue here - I'm kind of
 vacillating back and forth - but I do think one of the problems with
 RULEs is that they are too tied to particular command names.  Adding
 any new commands that can select or modify data - be it MERGE, UPSERT,
 or whatever - is going to cause trouble both for implementors and for
 people relying on the feature.

 And triggers (or anything else) would be better on that score because ...?

Well, my thought was that a trigger - at least a row-level trigger -
can presumably be fired on the basis of whether an individual row is
being insert or updated, rather than on whether the statement is named
INSERT or UPDATE.  If that's not correct, we've got some
head-scratching to do...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 8/8] Introduce wal decoding via catalog timetravel

2012-10-22 Thread Kevin Grittner
Simon Riggs wrote:
 Greg Stark st...@mit.edu wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Isn't there an even more serious problem, namely that this
 assumes *all* transactions are serializable?

Do you mean in terms of the serializable transaction isolation level,
or something else?

I haven't read the patches, but I've been trying to follow the
discussion and I don't recall any hint of basing this on serializable
transactions on each source. Of course, when it comes down to
commits, both where a change is committed and where the work is
copied, there must be a commit order; and with asynchronous work
where data isn't partitioned such that there is a single clear owner
for each partition there will be conflicts which must be resolved.  I
don't get the impression that this point has been lost on Simon and
Andres.

 What happens when they aren't? Or even just that the effective
 commit order is not XID order?

 Firstly, I haven't read the code but I'm confident it doesn't make
 the elementary error of assuming commit order == xid order. I
 assume it's applying the reassembled transactions in commit order.

Same here.

 I don't think it assumes the transactions are serializable because
 it's only concerned with writes, not reads. So the transaction
 it's replaying may or may not have been able to view the data
 written by other transactions that commited earlier but it doesn't
 matter when trying to reproduce the effects using constants.

IIRC, the developers of this feature have explicitly said that they
will defer any consideration of trying to extend serializable
transaction isolation behavior to a multi-server basis until after
they have other things working. (Frankly, to do otherwise would not
be sane.) It appears to me that it can't be managed in a general
sense without destroying almost all the advantages of multi-master
replication, at least (as I said before) where data isn't partitioned
such that there is a single clear owner for each partition.

Where such partitioning is present and there are data sets maintained
exclusively by serializable transactions, anomaly-free reads of the
data could be accomplished by committing transactions on the replicas
in apparent order of execution rather than commit order. Apparent
order of execution must take both commit order and read-write
dependencies into consideration.

 The data written by this transaction is either written or not when
 the commit happens and it's all written or not at that time. Even
 in non-serializable mode updates take row locks and nobody can see
 the data or modify it until the transaction commits.

As with read-only transactions and hot standbys, the problem comes in
when a transaction commits and is replicated while a transction
remains uncommitted which is basing its updates on the earlier state
of the data. It gets even more exciting with MMR since the
transaction working with the old version of the data might be on a
different machine, on another continent. With certain types of
workloads, it seems to me that it could get pretty crazy if certain
closely-related actions are not kept within a single database (like
controlling the active batch and adding items to a batch).

In the wild, half-baked, hand-wavey suggestions department -- maybe
there should be some consideration of a long-term way within MMR to
direct activities to certain logical nodes, each of which could be
mapped to a single physical node at any one time. Basically, to route
a request through the MMR network to the current logical node for
handling something, and have the effects ripple back out through all
nodes.

 This uses Commit Serializability, which is valid, as you say.

Well, it is a type of concurrency control with a name and a
definition, if that's what you mean. I agree it provides sufficient
guarantees to create a workable MMR system, if you have adequate
conflict resolution. My concern is that it not be confused with
serializability in the mathematical sense or in the sense of
transaction isolation levels.

In general on this thread, when I've seen the terms serializable
and serializability I haven't been clear on whether the words are
being used in their more general sense as words in the English
language, or in a more particular technical sense.

-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecations in authentication

2012-10-22 Thread Stephen Frost
Magnus, all,

* Magnus Hagander (mag...@hagander.net) wrote:
 On Thu, Oct 18, 2012 at 5:59 PM, Robert Haas robertmh...@gmail.com wrote:
  That seems like a sufficiently long deprecation window, but is gssapi
  a full substitute for krb5?  I don't really have a strong opinion on
  this, not being a user myself.
 
 I'm pretty sure that it is.
 
 Stephen, you usually have comments about the Kerberos stuff - want to
 comment on this one? :)

The biggest risk that I can think of regarding deprecating krb5 would be
platforms (if any still exist...) which don't have GSSAPI.  Is it
possible to see that from the buildfarm information or from the
configure results that people have for any strange/different platforms
out there?  The other question would be if we think anyone's actually
using krb5 on those platforms and/or would people in those situations be
willing/able to move to a different library which supports GSSAPI.

I'm all for deprecating krb5 myself, but I wouldn't want to break things
for people without good cause.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM

2012-10-22 Thread Andrew Dunstan


On 10/22/2012 10:18 AM, Robert Haas wrote:

On Sun, Oct 21, 2012 at 11:02 AM, Martijn van Oosterhout
klep...@svana.org wrote:

It bugs me every time you have to jump through hoops and get red
warnings for an unknown CA, whereas no encryption whatsoever is treated
as fine while being actually even worse.

+1.  Amen, brother.



Not really, IMNSHO. The difference is that an unencrypted session isn't 
pretending to be secure. In any case, it doesn't seem too intrusive for 
us to warn, at least in psql, with something like:


SSL connection (cipher: DHE-RSA-AES256-SHA, bits: 256) Host 
Certificate Unverified


If people want to get more paranoid they can always set PGSSLMODE to 
verify-ca or verify-full.



cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM

2012-10-22 Thread Peter Eisentraut
On 10/12/12 3:44 PM, Stephen Frost wrote:
 wrt future-proofing, I don't like the #-of-iterations approach.  There
 are a number of examples of how to deal with multiple encryption types
 being supported by a protocol, I'd expect hash'ing could be done in the
 same way.  For example, Negotiate, SSL, Kerberos, GSSAPI, all have ways
 of dealing with multiple encryption/hashing options being supported.
 Multiple iterations could be supported through that same mechanism (as
 des/des3 were both supported by Kerberos for quite some time).
 
 In general, I think it's good to build on existing implementations where
 possible.  Perhaps we could even consider using something which already
 exists for this?

Sounds like SASL to me.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] pg_ping utility

2012-10-22 Thread Phil Sorber
On Sun, Oct 21, 2012 at 6:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Phil Sorber p...@omniti.com writes:
 Here is the new patch. I renamed the utility from pg_ping to pingdb to
 go along with the naming convention of src/bin/scripts.

 Uh, no, that's not a step forward.  Leaving out a pg prefix from those
 script names is universally agreed to have been a mistake.  We've not
 felt that changing the legacy names is worth the amount of pain it'd
 cause, but that doesn't mean that we should propagate the mistake into
 brand new executable names.

 regards, tom lane

Ok. I asked about this and got no response so I assume there were no
strong opinions. I have reverted to the pg_ping name. Patches
attached.


pg_ping_bin_v2.diff
Description: Binary data


pg_ping_docs_v2.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ToDo: KNN Search should to support DISTINCT clasuse?

2012-10-22 Thread Pavel Stehule
2012/10/22 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 but using DISTINCT breaks KNN searching optimization

 postgres=# explain select distinct nazobce, nazobce - 'Benešov' from
 obce order by nazobce - 'Benešov' limit 10

 Don't hold your breath.  There are two ways the system could implement
 the DISTINCT clause: either sort and uniq, or hashaggregate.
 hashaggregate will destroy any input ordering, so there's no value in
 using the index as input.  sort and uniq requires the input to be sorted
 by *all* the columns being distinct'ed, not just one, so again this
 index isn't useful.  You could get a plan using the index if you only
 wanted the - output column, eg

 contrib_regression=# explain select distinct t - 'foo' from test_trgm order 
 by t - 'foo' limit 10;
  QUERY PLAN
 -
  Limit  (cost=0.00..0.87 rows=10 width=12)
-  Unique  (cost=0.00..86.75 rows=1000 width=12)
  -  Index Scan using ti on test_trgm  (cost=0.00..84.25 rows=1000 
 width=12)
Order By: (t - 'foo'::text)
 (4 rows)

 Perhaps it would be close enough to what you want to use DISTINCT ON:

 contrib_regression=# explain select distinct on( t - 'foo') *,t - 'foo' 
 from test_trgm order by t - 'foo' limit 10;
  QUERY PLAN
 -
  Limit  (cost=0.00..0.87 rows=10 width=12)
-  Unique  (cost=0.00..86.75 rows=1000 width=12)
  -  Index Scan using ti on test_trgm  (cost=0.00..84.25 rows=1000 
 width=12)
Order By: (t - 'foo'::text)
 (4 rows)

 regards, tom lane

good tip - it's working

thank you

Regards

Pavel


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2012-10-22 Thread Pavel Stehule
2012/10/22 Tom Lane t...@sss.pgh.pa.us:
 I wrote:
 I tested, and indeed this seems to work:
   CREATE TABLE t1 (c int[] WHERE EACH ELEMENT REFERENCES t2);
 and it's perfectly sensible from an English-grammar standpoint too.
 If we take that, how would we spell the table-constraint case exactly?
 Grammatically I'd prefer
   FOREIGN KEY (foo, EACH ELEMENT OF bar) REFERENCES

 Are people happy with these syntax proposals, or do we need some other
 color for the bikeshed?

I am ok

Pavel


 regards, tom lane


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2012-10-22 Thread Andrew Dunstan


On 10/22/2012 12:08 PM, Tom Lane wrote:

I wrote:

I tested, and indeed this seems to work:
CREATE TABLE t1 (c int[] WHERE EACH ELEMENT REFERENCES t2);
and it's perfectly sensible from an English-grammar standpoint too.
If we take that, how would we spell the table-constraint case exactly?
Grammatically I'd prefer
FOREIGN KEY (foo, EACH ELEMENT OF bar) REFERENCES

Are people happy with these syntax proposals, or do we need some other
color for the bikeshed?



I can live with it, although the different spelling is slightly jarring.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2012-10-22 Thread Andres Freund
On Monday, October 22, 2012 06:08:32 PM Tom Lane wrote:
 I wrote:
  I tested, and indeed this seems to work:
  CREATE TABLE t1 (c int[] WHERE EACH ELEMENT REFERENCES t2);
  
  and it's perfectly sensible from an English-grammar standpoint too.
  If we take that, how would we spell the table-constraint case exactly?
  Grammatically I'd prefer
  
  FOREIGN KEY (foo, EACH ELEMENT OF bar) REFERENCES
 
 Are people happy with these syntax proposals, or do we need some other
 color for the bikeshed?

Except that I'd prefer a WHERE in the table-constraint case as well for 
consistencies sake I am unsurprisingly happy with the proposal.

Greetings,

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] Row-Level Security

2012-10-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 The documentation lists several documented limitations that I would
 like to analyze a little bit.  First, it says that row-level security
 policies are not applied on UPDATE or DELETE.  That sounds downright
 dangerous to me.  Is there some really compelling reason we're not
 doing it?

[ blink... ]  Isn't that a security hole big enough for a Mack truck?

UPDATE tab SET foo = foo RETURNING *;

sucks out all the data just fine, if RLS doesn't apply to it.

Having said that, I fear that sensible row-level security for updates is
at least one order of magnitude harder than sensible row-level security
for selects.  We've speculated about how to define that in the past,
IIRC, but without any very satisfactory outcome.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2012-10-22 Thread Andrew Dunstan


On 10/22/2012 12:13 PM, Andres Freund wrote:

On Monday, October 22, 2012 06:08:32 PM Tom Lane wrote:

I wrote:

I tested, and indeed this seems to work:
CREATE TABLE t1 (c int[] WHERE EACH ELEMENT REFERENCES t2);

and it's perfectly sensible from an English-grammar standpoint too.
If we take that, how would we spell the table-constraint case exactly?
Grammatically I'd prefer

FOREIGN KEY (foo, EACH ELEMENT OF bar) REFERENCES

Are people happy with these syntax proposals, or do we need some other
color for the bikeshed?

Except that I'd prefer a WHERE in the table-constraint case as well for
consistencies sake I am unsurprisingly happy with the proposal.



That would look odd too, especially if the array isn't the last element 
in the FK:



FOREIGN KEY (foo, WHERE EACH ELEMENT OF bar, baz) REFERENCES


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Database object names and libpq in UTF-8 locale on Windows

2012-10-22 Thread Sebastien FLAESCH

Hi all,
I would appreciate some help or comments on this topic...
Is this the wrong mailing list to ask such question?
Seb

On 10/18/2012 10:15 AM, Sebastien FLAESCH wrote:

Hello,

Using PostgreSQL 9.2.1 on Windows, I am facing a strange character set
issue
with a UTF-8 database.

Maybe this is expected, but want to be sure that I am not missing
something.

On Windows, I have created a database with:

ENCODING = 'UTF-8'
LC_COLLATE = 'English_United States.1252'
LC_CTYPE = 'English_United States.1252'

I am using libpq with a simple C program.
The program handles UTF-8 strings.

While it's not a traditional UNICODE Windows application using WCHAR, it
should be possible to send and get back UTF-8 data with PostgreSQL.

Interacting with the Windows system in the system locale is another
story, but
from a pure C / SQL / libpq point of view, as long as the PostgreSQL client
encoding is properly defined to UTF-8, it should work, and it does mainly:

- I can use UTF-8 string constants in my queries.
- I can pass UTF-8 data to the database with parameterized queries.
- I can fetch UTF-8 data from the database.
- I can create db object names with UTF-8 characters.

But the db object names must be specified with double quotes:
When I do not use quoted db object names, I get a strange problem.
The table is created, I can use it in my program, I can even use it in the
pgAdmin query tool, but in the pgAdmin db browser, there is no table name
displayed in the treeview...

Further, when selecting schema information from pg_class, I can see that
the
table exists, but there is nothing displayed in the relname attribute...

It appears that the problem disappears when using a C collation are
char type:

ENCODING = 'UTF-8'
LC_COLLATE = 'C'
LC_CTYPE = 'C'

I suspect this has something to do with the fact that non-quoted
identifiers
are converted to lowercase, and because my LC_CTYPE is English_United
States.1252,
the conversion to lowercase fails...

But:

- why does PostgreSQL accept to create the table with invalid UTF-8
characters?

- why can I make queries in the query tool with the UTF-8 table name?
(note that show client_encoding in the query tool gives me UNICODE -
is this
the same as UTF-8?)


So is there a bug, or do I have to use a C collation and char type for
UTF-8
databases on Windows?

I know that UTF-8 is not supported by the Windows C/POSIX library
(setlocale,
and co).

Does it mean that it's not realistic to use UTF-8 encoding for PostgreSQL
databases on Windows...?

Thanks for reading.
Sebastien FLAESCH
Four Js Development Tools






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Database object names and libpq in UTF-8 locale on Windows

2012-10-22 Thread Andrew Dunstan


On 10/22/2012 12:53 PM, Sebastien FLAESCH wrote:

[Issues with unquoted utf8 identifiers in Windows 1252 locale]


I suspect this has something to do with the fact that non-quoted
identifiers
are converted to lowercase, and because my LC_CTYPE is English_United
States.1252,
the conversion to lowercase fails...



Quite possibly. The code comment says this:

/*
 * SQL99 specifies Unicode-aware case normalization, which we
   don't yet
 * have the infrastructure for.  Instead we use tolower() to
   provide a
 * locale-aware translation.  However, there are some locales
   where this
 * is not right either (eg, Turkish may do strange things with
   'i' and
 * 'I').  Our current compromise is to use tolower() for
   characters with
 * the high bit set, and use an ASCII-only downcasing for 7-bit
 * characters.
 */

For now your best bet is probably not to use UTF8 non-ascii chars or to 
quote the identifiers.


Given we're calling to_lower() on a single byte in the code referred to, 
should we even be doing that when we have a multi-byte encoding and the 
high bit is set?


Aside: I'd love to fix up our treatment of identifiers, but there is 
probably a LOT of very tedious work involved.


cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ToDo: KNN Search should to support DISTINCT clasuse?

2012-10-22 Thread Robert Haas
On Mon, Oct 22, 2012 at 11:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Pavel Stehule pavel.steh...@gmail.com writes:
 but using DISTINCT breaks KNN searching optimization

 postgres=# explain select distinct nazobce, nazobce - 'Benešov' from
 obce order by nazobce - 'Benešov' limit 10

 Don't hold your breath.  There are two ways the system could implement
 the DISTINCT clause: either sort and uniq, or hashaggregate.
 hashaggregate will destroy any input ordering, so there's no value in
 using the index as input.

Isn't that an implementation limitation though, rather than a
fundamental limitation?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] Row-Level Security

2012-10-22 Thread Robert Haas
On Mon, Oct 22, 2012 at 12:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 The documentation lists several documented limitations that I would
 like to analyze a little bit.  First, it says that row-level security
 policies are not applied on UPDATE or DELETE.  That sounds downright
 dangerous to me.  Is there some really compelling reason we're not
 doing it?

 [ blink... ]  Isn't that a security hole big enough for a Mack truck?

 UPDATE tab SET foo = foo RETURNING *;

 sucks out all the data just fine, if RLS doesn't apply to it.

Yep.

 Having said that, I fear that sensible row-level security for updates is
 at least one order of magnitude harder than sensible row-level security
 for selects.  We've speculated about how to define that in the past,
 IIRC, but without any very satisfactory outcome.

Uh, I don't agree.  SELECT and DELETE are pretty much identical cases.
 UPDATE needs all the same stuff that those two cases need, plus it
has an additional problem that it shares with INSERT - namely, someone
might insert a tuple that they cannot see or update a tuple such that
they can no longer see it.  However, both of those problems can be
handled via triggers, for now and maybe forever.  In contrast, the
problem that SELECT has - which UPDATE and DELETE also share - namely,
of rows being visible that should not be - is not nearly so
susceptible to that approach, both for performance reasons and because
there's no such thing as a trigger on SELECT.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM

2012-10-22 Thread Robert Haas
On Mon, Oct 22, 2012 at 10:57 AM, Andrew Dunstan and...@dunslane.net wrote:
 On 10/22/2012 10:18 AM, Robert Haas wrote:
 On Sun, Oct 21, 2012 at 11:02 AM, Martijn van Oosterhout
 klep...@svana.org wrote:

 It bugs me every time you have to jump through hoops and get red
 warnings for an unknown CA, whereas no encryption whatsoever is treated
 as fine while being actually even worse.

 +1.  Amen, brother.

 Not really, IMNSHO. The difference is that an unencrypted session isn't
 pretending to be secure. In any case, it doesn't seem too intrusive for us
 to warn, at least in psql, with something like:

 SSL connection (cipher: DHE-RSA-AES256-SHA, bits: 256) Host Certificate
 Unverified

Well, that change wouldn't bother me at all; in fact, I like it.  But
Firefox, for example, makes me do three or four clicks every time I
got to a website with an invalid SSL certificate, whereas a web site
that does not use SSL requires no clicks at all.  What's the sense in
that?  If we imagine that all activity is user-initiated - that is,
the user is always careful to ask for SSL when and only when they need
a higher level of security - then that's pretty sensible.  But in fact
the world doesn't work that way.  Most web pages are downloaded
automatically when you click on a link, and you don't normally look to
see whether SSL is in use unless you have a security concern (e.g.
because you are logging into your bank's web site).  If somebody went
and trojaned my bank's web page, they wouldn't need to break the SSL
certificate; they could just remove SSL from the login page
altogether.  Odds are very good that 95% of people wouldn't notice.

I think it's great to have a full-paranoia mode where anything not
kosher on the SSL connection is grounds for extreme panic.  But it
shouldn't be the default.  What Ubuntu is doing does not solve every
problem, but it does solve some problems, and we shouldn't go out of
our way to break it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2012-10-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Oct 22, 2012 at 12:08 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wrote:
 I tested, and indeed this seems to work:
 CREATE TABLE t1 (c int[] WHERE EACH ELEMENT REFERENCES t2);
 and it's perfectly sensible from an English-grammar standpoint too.
 If we take that, how would we spell the table-constraint case exactly?
 Grammatically I'd prefer
 FOREIGN KEY (foo, EACH ELEMENT OF bar) REFERENCES

 Are people happy with these syntax proposals, or do we need some other
 color for the bikeshed?

 Well, I can't say I'm very happy with the discrepancy between the two
 syntaxes, but I guess I'm in the minority.  Still, I can't help but
 think it's going to be confusing and hard to remember.  If we don't
 get complaints about it, I'll take that as evidence that the feature
 isn't being used, rather than evidence that the syntax is
 satisfactory.

I'm not thrilled with the inconsistency either, but given the
constraints we're under, it seems like the best we can do.  (I feel,
as Andrew does, that shoving WHERE into the table-constraint syntax
would not be an improvement; but the column-constraint syntax really
needs to start with a fully-reserved word).  Have you got a better
proposal?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2012-10-22 Thread Robert Haas
On Mon, Oct 22, 2012 at 12:08 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wrote:
 I tested, and indeed this seems to work:
   CREATE TABLE t1 (c int[] WHERE EACH ELEMENT REFERENCES t2);
 and it's perfectly sensible from an English-grammar standpoint too.
 If we take that, how would we spell the table-constraint case exactly?
 Grammatically I'd prefer
   FOREIGN KEY (foo, EACH ELEMENT OF bar) REFERENCES

 Are people happy with these syntax proposals, or do we need some other
 color for the bikeshed?

Well, I can't say I'm very happy with the discrepancy between the two
syntaxes, but I guess I'm in the minority.  Still, I can't help but
think it's going to be confusing and hard to remember.  If we don't
get complaints about it, I'll take that as evidence that the feature
isn't being used, rather than evidence that the syntax is
satisfactory.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-22 Thread Kevin Grittner
[I'm replying to Robert's message only because it is the latest on
the thread; I'm actually kinda replying to the whole thread in
general.]

When catching up on a backlog, one would hope that any thread
comprising more than 5% of said backlog would be more constructive. 
:-(

As someone coming in late with no skin in the game, here are my
observations:

(1) A suggestion was made by someone who was unaware of any actual
productive uses of rules (outside of the red herring of the internal
implementation detail regarding views -- which many other products
manage to provide without rules) that we clean up what was assumed to
be old baggage. I viewed the suggestion as having been made in more
or less the same spirit as suggesting cleaning up include directives
which weren't really needed or eliminating the storage manager layer:
sort of a bother, unfortunately some risk, but resulting in cleaner
and more maintainable code in the long run.

(2) My initial gut reaction to the suggestion was positive, as my
only attempt at using rules resulted in some very astonishing and
dangerous behavior in testing. When I asked about it I seem to
remember being told that rules were an old legacy feature which had
no real use and would generally bite you badly when an unexpected
type of query was run against the table with the rule. This matched
my later experiences of seeing people ask questions when they were
bitten or having problems getting rules to work as intended.

(3) A number of people then responded with claims that rules were
useful, but when those ignorant of such uses were curious about
examples, were either given hand-wavey descriptions or angry-sounding
challenges to prove that there were no such uses.

(4) Subsequent discussion has produced a few shadowy hints at what
such uses look like, with the most concrete being a one-time load of
partitions, for which rules are apparently one or two orders of
magnitude faster than FOR EACH ROW truggers. There was also a mention
of writing to a log table being easier. Out of 100+ messages on this
thread, I can't recall anything else that wasn't pretty vague.

(5) Even some of those opposing deprecation say they would like to
see rules go away eventually, once all of the (unspecified) uses have
better alternatives.

(6) There has been an assertion that it is impossible for the people
on the -hackers list to properly identify and enumerate the valid
real-world use-cases for rules; that we need to draw such information
from a wider group.

(7) There has been an aknowledgement that the documentation neither
makes clear where rules might really be useful, nor how they can
produce surprising results, including eating data. (Brainz aside,
I assume we can all agree that a rule can surprise you by eating
*data* you didn't expect it to?)

I can't think of anything I got out of the thread beyond the above.

Given the above, it seems that the first priority should be doing
something about the documentation.

Since as far as I can tell nobody has *any* trouble coming up with
dangerous uses of rules, the hold-up on getting anything else done is
in getting a wide sampling of appropriate and safe usage.

I invite anyone who thinks rules should stay permanently to write a
blog entry on Why Rules Are a Cool Feature. In particular, I would
love to see it include useful examples of what Andrew calls
non-trivial rules on a blog page he wrote:

  http://blog.rhodiumtoad.org.uk/2010/06/21/the-rule-challenge/

Failing that, how do we collect a broad enough sampling of current
usage to be sure we know when we have alternatives for every current
valid usage?

-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ToDo: KNN Search should to support DISTINCT clasuse?

2012-10-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Oct 22, 2012 at 11:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Don't hold your breath.  There are two ways the system could implement
 the DISTINCT clause: either sort and uniq, or hashaggregate.
 hashaggregate will destroy any input ordering, so there's no value in
 using the index as input.

 Isn't that an implementation limitation though, rather than a
 fundamental limitation?

Perhaps, but it's not a simple one to surmount, and I'm dubious about
putting the amount of work that'd be required into such a corner case.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans

2012-10-22 Thread Claudio Freire
On Thu, Oct 18, 2012 at 7:42 PM, Claudio Freire klaussfre...@gmail.com wrote:
 Fun. That didn't take long.

 With the attached anti-sequential scan patch, and effective_io_concurrency=8:


   QUERY PLAN
 -
  GroupAggregate  (cost=0.00..4149039.04 rows=90257289 width=4) (actual
 time=26.964..84299.789 rows=9001 loops=1)
-  Index Only Scan Backward using pgbench_accounts_pkey on
 pgbench_accounts  (cost=0.00..2795179.71 rows=90257289 width=4)
 (actual time=26.955..62761.774 rows=9001 loops=1)
  Index Cond: ((aid = 1000) AND (aid = 2))
  Heap Fetches: 0
  Total runtime: 87170.355 ms
 I/O thoughput 22MB/s (twice as fast)
 I/O utilization 95% (I was expecting 100% but... hey... good enough)

 With e_i_c=24, it gets to 100% utilization and 30MB/s (that's 3 times
 faster). So, I'd like to know what you think, but maybe for
 back-sequential scans, prefetch should be set to a multiple (ie: x24)
 of e_i_c, in order to exploit read request merges.


Earlier patch had a regression for heap-including scans backwards with
RAID, so I made the back-sequential optimization index-only-only and
now I can find no regression. Make check runs fine, btw.

I hope I'm not talking to myself.


postgresql-git-bt_prefetch_backseq.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM

2012-10-22 Thread Marko Kreen
On Fri, Oct 12, 2012 at 10:47 PM, Stephen Frost sfr...@snowman.net wrote:
 * Marko Kreen (mark...@gmail.com) wrote:
 As it works only on connect
 time, it can actually be secure, unlike user switching
 with SET ROLE.

 I'm guessing your issue with SET ROLE is that a RESET ROLE can be issued
 later..?  If so, I'd suggest that we look at fixing that, but realize it
 could break poolers.  For that matter, I'm not sure how the proposal to
 allow connections to be authenticated as one user but authorized as
 another (which we actually already support in some cases, eg: peer)
 *wouldn't* break poolers, unless you're suggesting they either use a
 separate connection for every user, or reconnect every time, both of
 which strike me as defeating a great deal of the point of having a
 pooler in the first place...

The point of pooler is to cache things.  The TCP connection
is only one thing to be cached, all the backend-internal
caches are as interesting - prepared plans, compiled functions.

The fact that on role reset you need to drop all those things
is what is breaking pooling.

Of course, I'm speaking only about high-performance situations.
Maybe there are cases where indeed the authenticated
TCP connection is only interesting to be cached.
Eg. with dumb client with raw sql only, where there
is nothing to cache in backend.  But it does not seem
like primary scenario we should optimize for.

-- 
marko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility

2012-10-22 Thread Jim Nasby

On 10/19/12 1:26 PM, Josh Berkus wrote:

What I'm saying is, we'll get nowhere promoting an application queue
which is permanently inferior to existing, popular open source software.
  My advice: Forget about the application queue aspects of this.  Focus
on making it work for replication and matviews, which are already hard
use cases to optimize.

If someone can turn this feature into the base for a distributed
queueing system later, then great.  But let's not complicate this
feature by worrying about a use case it may never fulfill.


And as someone else mentioned... we should call this a stream and not a queue, 
since this would be lacking in many queue features.

It certainly sounds like a useful framework to have.
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2012-10-22 Thread Robert Haas
On Mon, Oct 22, 2012 at 2:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not thrilled with the inconsistency either, but given the
 constraints we're under, it seems like the best we can do.  (I feel,
 as Andrew does, that shoving WHERE into the table-constraint syntax
 would not be an improvement; but the column-constraint syntax really
 needs to start with a fully-reserved word).  Have you got a better
 proposal?

Well, I think if that's the best we can do, you original proposal of
ditching the column constraint syntax altogether might be for the
best.  I wasn't too excited about that before, but I think having two
different syntaxes is going to be even worse.  In some ways, it's
actually sort of sensible, because the referring side isn't really the
column itself; it's some value extracted therefrom.  You can imagine
other variants of that as well, such as the recently-suggested

FOREIGN KEY ((somecol).member_name) REFERENCES othertab (doohicky)

Now, what would the column-constraint version of that look like?  Is
it even sensible to think that there SHOULD be a column-constraint
version of that?  I'm not convinced it is sensible, so maybe decreeing
that the table constraint version must be used to handle all
non-trivial cases is more sensible than I initially thought.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2012-10-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Well, I think if that's the best we can do, you original proposal of
 ditching the column constraint syntax altogether might be for the
 best.  I wasn't too excited about that before, but I think having two
 different syntaxes is going to be even worse.  In some ways, it's
 actually sort of sensible, because the referring side isn't really the
 column itself; it's some value extracted therefrom.  You can imagine
 other variants of that as well, such as the recently-suggested

 FOREIGN KEY ((somecol).member_name) REFERENCES othertab (doohicky)

 Now, what would the column-constraint version of that look like?  Is
 it even sensible to think that there SHOULD be a column-constraint
 version of that?  I'm not convinced it is sensible, so maybe decreeing
 that the table constraint version must be used to handle all
 non-trivial cases is more sensible than I initially thought.

I could easily go with that ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ToDo: KNN Search should to support DISTINCT clasuse?

2012-10-22 Thread Pavel Stehule
2012/10/22 Tom Lane t...@sss.pgh.pa.us:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Oct 22, 2012 at 11:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Don't hold your breath.  There are two ways the system could implement
 the DISTINCT clause: either sort and uniq, or hashaggregate.
 hashaggregate will destroy any input ordering, so there's no value in
 using the index as input.

 Isn't that an implementation limitation though, rather than a
 fundamental limitation?

 Perhaps, but it's not a simple one to surmount, and I'm dubious about
 putting the amount of work that'd be required into such a corner case.

I don't think so this use case is too special - but workaround working well

Regards

Pavel

 regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] Row-Level Security

2012-10-22 Thread Kohei KaiGai
2012/10/22 Robert Haas robertmh...@gmail.com:
 On Thu, Oct 18, 2012 at 2:19 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 Kohei KaiGai escribió:
 The revised patch fixes the problem that Daen pointed out.

 Robert, would you be able to give this latest version of the patch a
 look?

 Yeah, sorry I've been completely sidelined this CommitFest. It's been
 a crazy couple of months.  Prognosis for future craziness reduction
 uncertain.  Comments:

 The documentation lists several documented limitations that I would
 like to analyze a little bit.  First, it says that row-level security
 policies are not applied on UPDATE or DELETE.  That sounds downright
 dangerous to me.  Is there some really compelling reason we're not
 doing it?

It intends to simplify the patch to avoid doing everything within a single
patch. I'll submit the patch supporting UPDATE and DELETE for CF-Nov
in addition to the base one.

  Second, it says that row-level security policies are not
 currently applied on INSERT, so you should use a trigger, but implies
 that this will change in the future.  I don't think we should change
 that in the future; I think relying on triggers for that case is just
 fine.  Note that it could be an issue with the post-image for UPDATES,
 as well, and I think the trigger solution is similarly adequate to
 cover that case.

Hmm. I should not have written this in section of the current limitation.
It may give impression the behavior will be changed future.
OK, I'll try to revise the documentation stuff.

  With respect to the documented limitation regarding
 DECLARE/FETCH, what exactly will happen?  Can we describe this a bit
 more clearly rather than just saying the behavior will be
 unpredictable?

In case when user-id was switched after declaration of a cursor that
contains qualifier depending on current_user, its results set contains
rows with old user-id and rows with new user-id.

Here is one other option rather than documentation fix.
As we had a discussion on the upthread, it can be solved if we restore
the user-id associated with the portal to be run, however, a problem is
some commands switches user-id inside of the portal.
http://archives.postgresql.org/pgsql-hackers/2012-07/msg00055.php

Is there some good idea to avoid the problem?

 It looks suspiciously as if the row-level security mode needs to be
 saved and restored in all the same places we call save and restore the
 user ID and security context.  Is there some reason the
 row-level-security-enabled flag shouldn't just become another bit in
 the security context?  Then we'd get all of this save/restore logic
 mostly for free.

It seems to me a good idea, but I didn't find out this.

 ATExecSetRowLevelSecurity() calls SetRowLevelSecurity() or
 ResetRowLevelSecurity() to update pg_rowlevelsec, but does the
 pg_class update itself.  I think that all of this logic should be
 moved into  a single function, or at least functions in the same file,
 with the one that only updates pg_rowlevelsec being static and
 therefore not able to be called from outside the file.  We always need
 the pg_class update and the pg_rowlevelsec update to happen together,
 so it's not good to have an exposed function that does one of those
 updates but not the other.  I think the simplest thing is just to move
 ATExecSetRowLevelSecurity to pg_rowlevelsec.c and rename it to
 SetRowLevelSecurity() and then give it two static helper functions,
 say InsertPolicyRow() and DeletePolicyRow().

OK, I'll rework the code.

 I think it would be good if Tom could review the query-rewriting parts
 of this (viz rowlevelsec.c) as I am not terribly familiar with this
 machinery, and of course anything we get wrong here will have security
 consequences.  At first blush, I'm somewhat concerned about the fact
 that we're trying to do this after query rewriting; that seems like it
 could break things.  I know KaiGai mentioned upthread that the
 rewriter won't be rerun if the plan is invalidated, but (1) why is
 that OK now? and (2) if it is OK now, then why is it OK to do
 rewriting of the RLS qual - only - after rewriting if all of the rest
 of the rewriting needs to happen earlier?

I just follow the existing behavior of plan invalidation; that does not
re-run the query rewriter. So, if we have no particular reason why
we should not run the rewriter again to handle RLS quals, it might
be an option to handle RLS as a part of rewriter.

At least, here is two problems. 1) System column is problematic
when SELECT statement is replaced by sub-query. 2) It makes
infinite recursion when a certain table has SELECT INSTEAD
rule with a sub-query on the same table.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [Bug] SELECT INSTEAD with sub-query

2012-10-22 Thread Kohei KaiGai
I could find out a case to cause Assert() failure during investigation of RLS...

postgres=# CREATE TABLE t1 (x int, y text);
CREATE TABLE

postgres=# CREATE RULE _RETURN AS ON SELECT TO t1 DO INSTEAD SELECT
1 AS x, 'aaa'::text AS y;
CREATE RULE

postgres=# SELECT * FROM t1;
 x |  y
---+-
 1 | aaa
(1 row)

postgres=# SELECT tableoid, * FROM t1;
TRAP: FailedAssertion(!(attno = rel-min_attr  attno =
rel-max_attr), File: initsplan.c, Line: 180)
The connection to the server was lost. Attempting reset: LOG:  server
process (PID 27345) was terminated by signal 6: Aborted
DETAIL:  Failed process was running: SELECT tableoid, * FROM t1;


The failure scenario is obvious. Reference to t1 was replaced by a sub-query
that does not have any system columns, however, the given query tries to the
column with negative attribute number. Then, it was caught on the
FailedAssertion.
How do we fix the problem? Please give us some comments.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM

2012-10-22 Thread Marko Kreen
On Wed, Oct 10, 2012 at 4:24 PM, Marko Kreen mark...@gmail.com wrote:
 The SCRAM looks good from the quick glance.

SCRAM does have weakness - the info necessary to log in
as client (ClientKey) is exposed during authentication
process.

IOW, the stored auth info can be used to log in as client,
if the attacker can listen on or participate in login process.
The random nonces used during auth do not matter,
what matters is that the target server has same StoredKey
(same password, salt and iter).

It seems this particular attack is avoided by SRP.

This weakness can be seen as feature tho - it can be
used by poolers to cache auth info and re-connect
to server later.  They need full access to stored keys still.

But it does make it give different security guaratees
depending whether SSL is in use or not.

-- 
marko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-22 Thread Merlin Moncure
On Fri, Oct 19, 2012 at 2:55 PM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Oct 19, 2012 at 10:29 AM, Andrew Dunstan and...@dunslane.net wrote:
 As you can see, in the case of rewrite it takes us back 7 1/2 years. I know
 this is a *very* rough measure, but it still tends to indicate to me that
 the maintenance burden isn't terribly high.

 That's a pretty neat one-liner.  However... in my view, the real cost
 of rules is that they are hard to support as we add new features to
 SQL.  I believe we already decided to punt on making them work with
 CTEs... and maybe one other case?  I don't really remember the details
 any more, but presumably this will come up again with MERGE, and
 perhaps other cases...

Good point on the CTE (and it's correct).  I think by any reasonable
definition rules are in fact already de facto deprecated: they are not
being extended to interact with other features and the community is
advising against their use.  I don't think anybody would complain
if/when a hypothetical MERGE feature was advanced without rule
interaction.

That said, I don't think there is any reasonable argument to remove
rules.  Backwards compatibility should only be broken when it *must*
be broken.  Any 'developer interest only' standards ('grotty code',
'inelegant', 'ill advised for new code', etc) of removal are
completely specious and thus are IMSNHO irrelevant.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM

2012-10-22 Thread Greg Stark
On Sun, Oct 21, 2012 at 5:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 I don't see a problem at all with providing the snakeoil cert. In
 fact, it's quite useful.

 I see a problem with enabling it by default. Because it makes people
 think they are more secure than they are.

 I am far from an SSL expert, but I had the idea that the only problem
 with a self-signed cert is that the client can't trace it to a trusted
 cert --- so if the user took the further step of copying the cert to the
 client machines' ~/.postgresql/root.crt files, wouldn't things be just
 fine?

I'm not sure if server certs are supposed to go in the root.crt file
or somewhere else. It's a bit tricky to distribute them securely but
most people will just scp them and call that good since they ignore
the ssh host key messages anyways.

Fwiw the main problem here is that you're vulnerable to a MitM attack.
In theory we could work around that by doing something like encrypting
the ssl public key in a key based on a query provided by the user.
That query would have to include some server data that the client can
predict and that only the correct server would have access to. There
are obvious problems with this though and inventing our own security
protocol is almost certainly a bad idea even if we can fix them.

 In a browser, they will get a big fat warning every time, so they will
 know it. There is no such warning in psql. Actually, maybe we should
 *add* such a warning. We could do it in psql. We can't do it in libpq
 for everyone, but we can do it in our own tools... Particularly since
 we do print the SSL information already - we could just add a
 warning: cert not verified or something like that to the same piece
 of information.

I think we can provide a much better warning however. I think we want
something like 'WARNING: Server identity signed by unknown and
untrusted authority Snakeoil CA'

We could go even further:
INFO: Server identity ACME Debian Machine certified by Snakeoil CA
WARNING: Server identity signed by unknown and untrusted authority Snakeoil CA
HINT: Add either the server certificate or the CA certificate to
/usr/lib/ssl/certs after verifying the identity and certificate hash

SSL is notoriously hard to set up, it would go a long way to give the
sysadmin an immediate pointer to what certificates are being used and
where to find or install the CA certs. It might be worth mentioning
the GUC parameter names to control these things too.

 What happens in the other direction, ie if a client presents a
 self-signed cert that the server can't verify?

Surely that's just a failure. The server always expects client
authentication and a connection authenticated using an unverified cert
could be anyone at all. Clients traditionally didn't authenticate the
server until encrypted connections entered the picture and preventing
MitM attacks became relevant.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ToDo: KNN Search should to support DISTINCT clasuse?

2012-10-22 Thread Greg Stark
On Mon, Oct 22, 2012 at 4:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Don't hold your breath.  There are two ways the system could implement
 the DISTINCT clause: either sort and uniq, or hashaggregate.
 hashaggregate will destroy any input ordering, so there's no value in
 using the index as input.  sort and uniq requires the input to be sorted
 by *all* the columns being distinct'ed, not just one, so again this
 index isn't useful.

We already have some bits that understand functional dependencies for
distinct/group by don't we? Do we detect that col - 'foo' is
functionally dependent on col? If so is it possible to construct a
multicolumn index that can produce an ordering like [col - 'foo',
col] which could be used to get distinct values of col in the knn
order (since the first column is functionally dependent on the second
it can be ignored for grouping purposes).

Not that we can do this now but I wonder whether a lot of the pieces
are already there and just need to be hooked up together.


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecating RULES

2012-10-22 Thread David Johnston
 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
 ow...@postgresql.org] On Behalf Of Merlin Moncure
 Sent: Monday, October 22, 2012 6:54 PM
 To: Robert Haas
 Cc: Andrew Dunstan; Josh Berkus; Daniel Farina; pgsql-
 hack...@postgresql.org
 Subject: Re: [HACKERS] Deprecating RULES
  
 Good point on the CTE (and it's correct).  I think by any reasonable
definition
 rules are in fact already de facto deprecated: they are not being extended
to
 interact with other features and the community is advising against their
use.
 I don't think anybody would complain if/when a hypothetical MERGE feature
 was advanced without rule interaction.
 
 That said, I don't think there is any reasonable argument to remove rules.
 Backwards compatibility should only be broken when it *must* be broken.
 Any 'developer interest only' standards ('grotty code', 'inelegant', 'ill
advised
 for new code', etc) of removal are completely specious and thus are IMSNHO
 irrelevant.
 
 merlin

While I agree with this sentiment to some degree in order for the community
to thrive new developer blood needs to be introduced periodically.  Not that
this feature is particularly an issue but making the codebase easier to
learn and maintain has considerable value in its own right.

To put a different spin on things it is like CREATE RULE is a specialty
tool.  Taken that way we should strictly describe the uses-cases where
CREATE RULE behavior is well-defined and problem free.  If the end-user
isn't trying to use RULEs in exactly those cases then they are advised to
attempt another solution or send an e-mail to the list to get some expert
opinions on that particular use-case.  Known problematic uses can also be
listed to minimize the amount of not listed, what do y'all think e-mails
sent to the list.  In this setup there is some developer obligation to try
and not break those well-defined use-cases; but that exists today even if
it is not explicitly mentioned.

David J.






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility

2012-10-22 Thread Greg Stark
On Wed, Oct 17, 2012 at 7:48 PM, Christopher Browne cbbro...@gmail.com wrote:
 Well, replication is arguably a relevant case.

 For Slony, the origin/master node never cares about logged changes - that
 data is only processed on replicas.  Now, that's certainly a little weaselly
 - the log data (sl_log_*) has got to get read to get to the replica.

Well this is a clever way for Slony to use existing infrastructure to
get data into the WAL. But wouldn't it be more logical for an in-core
system to just annotate the existing records with enough information
to replay them logically? Instead of synthesizing inserts into an
imaginary table containing data that can be extracted to retrieve info
about some other record, just add the info needed to the relevant
record.

The minimum needed for DML afaict is DELETE and UPDATE records need
the primary key of the record being deleted and updated. It might make
sense to include the whole tupledesc or at least key parts of it like
the attlen and atttyp array so that replay can be more robust. But the
logical place for this data, it seems to me, is *in* the update or
insert record that already exists. Otherwise managing logical
standbies will require a whole duplicate set of infrastructure to keep
track of what has and hasn't been replayed. For instance what if an
update record is covered by a checkpoint but the logical record falls
after the checkpoint and the system crashes before writing it out?



-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Bug] SELECT INSTEAD with sub-query

2012-10-22 Thread Tom Lane
Kohei KaiGai kai...@kaigai.gr.jp writes:
 postgres=# CREATE TABLE t1 (x int, y text);
 CREATE TABLE

 postgres=# CREATE RULE _RETURN AS ON SELECT TO t1 DO INSTEAD SELECT
 1 AS x, 'aaa'::text AS y;
 CREATE RULE

 postgres=# SELECT * FROM t1;
  x |  y
 ---+-
  1 | aaa
 (1 row)

 postgres=# SELECT tableoid, * FROM t1;
 TRAP: FailedAssertion(!(attno = rel-min_attr  attno =
 rel-max_attr), File: initsplan.c, Line: 180)

Huh.  I'm amazed nobody noticed this before.  It's really a bug in the
convert table to view logic: a view has no system columns so we ought
to remove the pg_attribute entries for the system columns, but we don't.

Given that we can't retroactively fix the pg_attribute rows for existing
views, I guess that we'll also have to put in a defense in the parser to
not believe that views have any system columns, even if pg_attribute
claims they do.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers