Re: [HACKERS] [WIP PATCH] for Performance Improvement in Buffer Management
On Saturday, October 20, 2012 11:07 PM Jeff Janes wrote:
> On Fri, Oct 19, 2012 at 11:00 PM, Amit kapila amit.kap...@huawei.com wrote:
>>> Robert wrote an accounting patch a while ago that tallied how often a buffer was cleaned but then reclaimed for the same page before being evicted. But now I can't find it. If you can find that thread, there might be some benchmarks posted to it that would be useful.
>> In my first-level search, I am also not able to find it. But now I am planning to check all mails of Robert Haas on the PostgreSQL site (which are approximately 13,000). If you can tell me approximately how long ago (last year, 2 years back, ...) or whether such a patch was submitted to any CF or was just discussed in a mail chain, then it will be a little easier for me.
> It was just an instrumentation patch for doing experiments, not intended for commit. I've tracked it down to the thread "Initial 9.2 pgbench write results". But I don't think it applies to the -S benchmark, because it records when the background writer cleaned a buffer by finding it dirty and writing it out to make it clean, while in this situation we would need something more like either "made the buffer clean and reusable" or "observed the buffer to already be clean and reusable".

Do you think an instrumentation patch which can give us how many times a buffer is found by the clock sweep and how many times it's found on the freelist would be useful? I have written something on similar lines when I was testing this patch, to find out how many times this patch can avoid the clock sweep. My observation was that although the new implementation saves many cycles of the clock sweep, with shared buffers up to 2-2.5GB there is still no visible performance gain.

With Regards,
Amit Kapila.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump --split patch
Hi,

Now that the (at least as far as I know) last ordering problem in pg_dump has been solved [1], I'm going to attempt resurrecting this old thread. It seemed to me that the biggest objections to this patch in the old discussions were directed at the implementation, which I have tried to improve.

The attached patch implements the actual splitting in a new backup format. The general output scheme looks like this: schemaname/OBJECT_TYPES/object_name.sql, but there are some exceptions. Overloaded functions are dumped into the same file. Object names are encoded into the POSIX Portable Filename Character Set ([a-z0-9._-]) by replacing any characters outside that set with an underscore.

Restoring the dump is supported through an index.sql file containing statements which include (through \i) the actual object files in the dump directory.

Any thoughts? Objections to the idea or the implementation?

[1]: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7b583b20b1c95acb621c71251150beef958bb603

Regards,
Marko Tiikkaja

*** a/src/bin/pg_dump/Makefile
--- b/src/bin/pg_dump/Makefile
*** 19,25 ****
  include $(top_builddir)/src/Makefile.global
  override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
  OBJS=	pg_backup_archiver.o pg_backup_db.o pg_backup_custom.o \
! 	pg_backup_null.o pg_backup_tar.o \
  	pg_backup_directory.o dumpmem.o dumputils.o compress_io.o $(WIN32RES)
  KEYWRDOBJS = keywords.o kwlookup.o
--- 19,25 ----
  override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
  OBJS=	pg_backup_archiver.o pg_backup_db.o pg_backup_custom.o \
! 	pg_backup_null.o pg_backup_tar.o pg_backup_split.o \
  	pg_backup_directory.o dumpmem.o dumputils.o compress_io.o $(WIN32RES)
  KEYWRDOBJS = keywords.o kwlookup.o

*** a/src/bin/pg_dump/pg_backup.h
--- b/src/bin/pg_dump/pg_backup.h
*** 50,56 ****
  typedef enum _archiveFormat
  	archCustom = 1,
  	archTar = 3,
  	archNull = 4,
! 	archDirectory = 5
  } ArchiveFormat;
  typedef enum _archiveMode
--- 50,57 ----
  	archCustom = 1,
  	archTar = 3,
  	archNull = 4,
! 	archDirectory = 5,
! 	archSplit = 6
  } ArchiveFormat;
  typedef enum _archiveMode

*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
*** 2125,2130 **** _allocAH(const char *FileSpec, const ArchiveFormat fmt,
--- 2125,2134 ----
  		case archTar:
  			InitArchiveFmt_Tar(AH);
  			break;
+
+ 		case archSplit:
+ 			InitArchiveFmt_Split(AH);
+ 			break;

  		default:
  			exit_horribly(modulename, "unrecognized file format \"%d\"\n", fmt);

*** a/src/bin/pg_dump/pg_backup_archiver.h
--- b/src/bin/pg_dump/pg_backup_archiver.h
*** 369,374 **** extern void InitArchiveFmt_Custom(ArchiveHandle *AH);
--- 369,375 ----
  extern void InitArchiveFmt_Null(ArchiveHandle *AH);
  extern void InitArchiveFmt_Directory(ArchiveHandle *AH);
  extern void InitArchiveFmt_Tar(ArchiveHandle *AH);
+ extern void InitArchiveFmt_Split(ArchiveHandle *AH);

  extern bool isValidTarHeader(char *header);

*** /dev/null
--- b/src/bin/pg_dump/pg_backup_split.c
*** 0 ****
--- 1,1063 ----
+ /*-------------------------------------------------------------------------
+  *
+  * pg_backup_split.c
+  *
+  *	A split format dump is a directory, which contains all database objects
+  *	separated into .sql files, and an index.sql file with psql statements
+  *	to allow restoring the separated objects.
+  *
+  *-------------------------------------------------------------------------
+  */
+
+ #include "postgres_fe.h"
+ #include "libpq-fe.h"
+ #include "libpq/libpq-fs.h"
+ #include "pg_backup_archiver.h"
+ #include "dumpmem.h"
+ #include "dumputils.h"
+
+ #include <dirent.h>
+ #include <sys/stat.h>
+
+ typedef struct
+ {
+ 	char	   *filename;	/* filename excluding the directory (basename) */
+ 	DumpId		dumpId;		/* dump id of the TocEntry */
+ } lclTocEntry;
+
+ typedef struct
+ {
+ 	/*
+ 	 * Our archive location. This is basically what the user specified as his
+ 	 * backup file but of course here it is a directory.
+ 	 */
+ 	char	   *directory;
+
+ 	FILE	   *dataFH;		/* currently open data file */
+
+ 	lclTocEntry **sortedToc;	/* array of TOC entries sorted by (filename, dumpId) */
+ } lclContext;
+
+ /* translator: this is a module name */
+ static const char *modulename = gettext_noop("split archiver");
+
+
+ /* prototypes for private functions */
+ static void _ArchiveEntry(ArchiveHandle *AH, TocEntry *te);
+ static void _StartData(ArchiveHandle *AH, TocEntry *te);
+ static void _EndData(ArchiveHandle *AH, TocEntry *te);
+ static size_t _WriteData(ArchiveHandle *AH, const void *data,
Re: [HACKERS] Deprecating RULES
Robert Haas robertmh...@gmail.com writes:
> The problems with MERGE are mostly around concurrency, as far as I can tell. I can't see why RULEs would have anything to do with it - except that I don't see how MERGE can sanely support rules, and even if we find a way to make it do that, anyone already using RULEs will need to adjust them to support MERGE.
> I'm not sure I have a horribly well-thought-out position on the underlying issue here - I'm kind of vacillating back and forth - but I do think one of the problems with RULEs is that they are too tied to particular command names. Adding any new commands that can select or modify data - be it MERGE, UPSERT, or whatever - is going to cause trouble both for implementors and for people relying on the feature.

And triggers (or anything else) would be better on that score because ...?

			regards, tom lane
[HACKERS] ToDo: KNN Search should support DISTINCT clause?
Hello

I need to search for distinct values based on similarity:

postgres=# explain select nazobce, nazobce <-> 'Benešov' from obce order by nazobce <-> 'Benešov' limit 10;
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.86 rows=10 width=10)
   ->  Index Scan using obce_nazobce_idx on obce  (cost=0.00..1433.14 rows=16644 width=10)
         Order By: (nazobce <-> 'Benešov'::text)
(3 rows)

Time: 0.576 ms

but using DISTINCT breaks the KNN searching optimization:

postgres=# explain select distinct nazobce, nazobce <-> 'Benešov' from obce order by nazobce <-> 'Benešov' limit 10;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Limit  (cost=600.45..600.47 rows=10 width=10)
   ->  Sort  (cost=600.45..613.80 rows=5341 width=10)
         Sort Key: ((nazobce <-> 'Benešov'::text))
         ->  HashAggregate  (cost=418.27..485.03 rows=5341 width=10)
               ->  Seq Scan on obce  (cost=0.00..335.05 rows=16644 width=10)
(5 rows)

Regards

Pavel Stehule
Re: [HACKERS] [v9.3] Row-Level Security
On Thu, Oct 18, 2012 at 2:19 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> Kohei KaiGai wrote:
>> The revised patch fixes the problem that Dean pointed out.
> Robert, would you be able to give this latest version of the patch a look?

Yeah, sorry I've been completely sidelined this CommitFest. It's been a crazy couple of months. Prognosis for future craziness reduction uncertain.

Comments:

The documentation lists several documented limitations that I would like to analyze a little bit. First, it says that row-level security policies are not applied on UPDATE or DELETE. That sounds downright dangerous to me. Is there some really compelling reason we're not doing it? Second, it says that row-level security policies are not currently applied on INSERT, so you should use a trigger, but implies that this will change in the future. I don't think we should change that in the future; I think relying on triggers for that case is just fine. Note that it could be an issue with the post-image for UPDATEs as well, and I think the trigger solution is similarly adequate to cover that case.

With respect to the documented limitation regarding DECLARE/FETCH, what exactly will happen? Can we describe this a bit more clearly rather than just saying the behavior will be unpredictable?

It looks suspiciously as if the row-level security mode needs to be saved and restored in all the same places we save and restore the user ID and security context. Is there some reason the row-level-security-enabled flag shouldn't just become another bit in the security context? Then we'd get all of this save/restore logic mostly for free.

ATExecSetRowLevelSecurity() calls SetRowLevelSecurity() or ResetRowLevelSecurity() to update pg_rowlevelsec, but does the pg_class update itself. I think that all of this logic should be moved into a single function, or at least functions in the same file, with the one that only updates pg_rowlevelsec being static and therefore not able to be called from outside the file. We always need the pg_class update and the pg_rowlevelsec update to happen together, so it's not good to have an exposed function that does one of those updates but not the other. I think the simplest thing is just to move ATExecSetRowLevelSecurity to pg_rowlevelsec.c, rename it to SetRowLevelSecurity(), and then give it two static helper functions, say InsertPolicyRow() and DeletePolicyRow().

I think it would be good if Tom could review the query-rewriting parts of this (viz rowlevelsec.c), as I am not terribly familiar with this machinery, and of course anything we get wrong here will have security consequences. At first blush, I'm somewhat concerned about the fact that we're trying to do this after query rewriting; that seems like it could break things. I know KaiGai mentioned upthread that the rewriter won't be rerun if the plan is invalidated, but (1) why is that OK now? and (2) if it is OK now, then why is it OK to do rewriting of the RLS qual - only - after rewriting if all of the rest of the rewriting needs to happen earlier?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Deprecating RULES
On Mon, Oct 22, 2012 at 8:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> The problems with MERGE are mostly around concurrency, as far as I can tell. I can't see why RULEs would have anything to do with it - except that I don't see how MERGE can sanely support rules, and even if we find a way to make it do that, anyone already using RULEs will need to adjust them to support MERGE. I'm not sure I have a horribly well-thought-out position on the underlying issue here - I'm kind of vacillating back and forth - but I do think one of the problems with RULEs is that they are too tied to particular command names. Adding any new commands that can select or modify data - be it MERGE, UPSERT, or whatever - is going to cause trouble both for implementors and for people relying on the feature.
> And triggers (or anything else) would be better on that score because ...?

Well, my thought was that a trigger - at least a row-level trigger - can presumably be fired on the basis of whether an individual row is being inserted or updated, rather than on whether the statement is named INSERT or UPDATE. If that's not correct, we've got some head-scratching to do...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH 8/8] Introduce wal decoding via catalog timetravel
Simon Riggs wrote:
> Greg Stark st...@mit.edu wrote:
>> Tom Lane t...@sss.pgh.pa.us wrote:
>>> Isn't there an even more serious problem, namely that this assumes *all* transactions are serializable?

Do you mean in terms of the serializable transaction isolation level, or something else? I haven't read the patches, but I've been trying to follow the discussion and I don't recall any hint of basing this on serializable transactions on each source. Of course, when it comes down to commits, both where a change is committed and where the work is copied, there must be a commit order; and with asynchronous work where data isn't partitioned such that there is a single clear owner for each partition, there will be conflicts which must be resolved. I don't get the impression that this point has been lost on Simon and Andres.

>>> What happens when they aren't? Or even just that the effective commit order is not XID order?
>> Firstly, I haven't read the code but I'm confident it doesn't make the elementary error of assuming commit order == xid order.
> I assume it's applying the reassembled transactions in commit order.

Same here.

>> I don't think it assumes the transactions are serializable because it's only concerned with writes, not reads. So the transaction it's replaying may or may not have been able to view the data written by other transactions that committed earlier but it doesn't matter when trying to reproduce the effects using constants.

IIRC, the developers of this feature have explicitly said that they will defer any consideration of trying to extend serializable transaction isolation behavior to a multi-server basis until after they have other things working. (Frankly, to do otherwise would not be sane.) It appears to me that it can't be managed in a general sense without destroying almost all the advantages of multi-master replication, at least (as I said before) where data isn't partitioned such that there is a single clear owner for each partition.
Where such partitioning is present and there are data sets maintained exclusively by serializable transactions, anomaly-free reads of the data could be accomplished by committing transactions on the replicas in apparent order of execution rather than commit order. Apparent order of execution must take both commit order and read-write dependencies into consideration.

The data written by this transaction is either written or not when the commit happens, and it's all written or not at that time. Even in non-serializable mode, updates take row locks and nobody can see the data or modify it until the transaction commits. As with read-only transactions and hot standbys, the problem comes in when a transaction commits and is replicated while a transaction remains uncommitted which is basing its updates on the earlier state of the data. It gets even more exciting with MMR, since the transaction working with the old version of the data might be on a different machine, on another continent. With certain types of workloads, it seems to me that it could get pretty crazy if certain closely-related actions are not kept within a single database (like controlling the active batch and adding items to a batch).

In the wild, half-baked, hand-wavey suggestions department -- maybe there should be some consideration of a long-term way within MMR to direct activities to certain logical nodes, each of which could be mapped to a single physical node at any one time. Basically, to route a request through the MMR network to the current logical node for handling something, and have the effects ripple back out through all nodes.

This uses Commit Serializability, which is valid, as you say. Well, it is a type of concurrency control with a name and a definition, if that's what you mean. I agree it provides sufficient guarantees to create a workable MMR system, if you have adequate conflict resolution.
My concern is that it not be confused with serializability in the mathematical sense or in the sense of transaction isolation levels. In general on this thread, when I've seen the terms serializable and serializability I haven't been clear on whether the words are being used in their more general sense as words in the English language, or in a more particular technical sense.

-Kevin
Re: [HACKERS] Deprecations in authentication
Magnus, all,

* Magnus Hagander (mag...@hagander.net) wrote:
> On Thu, Oct 18, 2012 at 5:59 PM, Robert Haas robertmh...@gmail.com wrote:
>> That seems like a sufficiently long deprecation window, but is gssapi a full substitute for krb5? I don't really have a strong opinion on this, not being a user myself.
> I'm pretty sure that it is. Stephen, you usually have comments about the Kerberos stuff - want to comment on this one? :)

The biggest risk that I can think of regarding deprecating krb5 would be platforms (if any still exist...) which don't have GSSAPI. Is it possible to see that from the buildfarm information or from the configure results that people have for any strange/different platforms out there? The other question would be if we think anyone's actually using krb5 on those platforms and/or would people in those situations be willing/able to move to a different library which supports GSSAPI.

I'm all for deprecating krb5 myself, but I wouldn't want to break things for people without good cause.

Thanks,

Stephen
Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM
On 10/22/2012 10:18 AM, Robert Haas wrote:
> On Sun, Oct 21, 2012 at 11:02 AM, Martijn van Oosterhout klep...@svana.org wrote:
>> It bugs me every time you have to jump through hoops and get red warnings for an unknown CA, whereas no encryption whatsoever is treated as fine while being actually even worse.
> +1. Amen, brother.

Not really, IMNSHO. The difference is that an unencrypted session isn't pretending to be secure. In any case, it doesn't seem too intrusive for us to warn, at least in psql, with something like:

SSL connection (cipher: DHE-RSA-AES256-SHA, bits: 256)
Host Certificate Unverified

If people want to get more paranoid they can always set PGSSLMODE to verify-ca or verify-full.

cheers

andrew
Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM
On 10/12/12 3:44 PM, Stephen Frost wrote:
> wrt future-proofing, I don't like the #-of-iterations approach. There are a number of examples of how to deal with multiple encryption types being supported by a protocol, I'd expect hash'ing could be done in the same way. For example, Negotiate, SSL, Kerberos, GSSAPI, all have ways of dealing with multiple encryption/hashing options being supported. Multiple iterations could be supported through that same mechanism (as des/des3 were both supported by Kerberos for quite some time). In general, I think it's good to build on existing implementations where possible. Perhaps we could even consider using something which already exists for this?

Sounds like SASL to me.
Re: [HACKERS] [WIP] pg_ping utility
On Sun, Oct 21, 2012 at 6:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Phil Sorber p...@omniti.com writes:
>> Here is the new patch. I renamed the utility from pg_ping to pingdb to go along with the naming convention of src/bin/scripts.
> Uh, no, that's not a step forward. Leaving out a pg prefix from those script names is universally agreed to have been a mistake. We've not felt that changing the legacy names is worth the amount of pain it'd cause, but that doesn't mean that we should propagate the mistake into brand new executable names.
>
> 			regards, tom lane

Ok. I asked about this and got no response so I assumed there were no strong opinions. I have reverted to the pg_ping name. Patches attached.

Attachments: pg_ping_bin_v2.diff, pg_ping_docs_v2.diff
Re: [HACKERS] ToDo: KNN Search should support DISTINCT clause?
2012/10/22 Tom Lane t...@sss.pgh.pa.us:
> Pavel Stehule pavel.steh...@gmail.com writes:
>> but using DISTINCT breaks the KNN searching optimization
>> postgres=# explain select distinct nazobce, nazobce <-> 'Benešov' from obce order by nazobce <-> 'Benešov' limit 10
>
> Don't hold your breath. There are two ways the system could implement the DISTINCT clause: either sort and uniq, or hashaggregate. hashaggregate will destroy any input ordering, so there's no value in using the index as input. Sort and uniq requires the input to be sorted by *all* the columns being distinct'ed, not just one, so again this index isn't useful.
>
> You could get a plan using the index if you only wanted the <-> output column, eg:
>
> contrib_regression=# explain select distinct t <-> 'foo' from test_trgm order by t <-> 'foo' limit 10;
>                                     QUERY PLAN
> -----------------------------------------------------------------------------------
>  Limit  (cost=0.00..0.87 rows=10 width=12)
>    ->  Unique  (cost=0.00..86.75 rows=1000 width=12)
>          ->  Index Scan using ti on test_trgm  (cost=0.00..84.25 rows=1000 width=12)
>                Order By: (t <-> 'foo'::text)
> (4 rows)
>
> Perhaps it would be close enough to what you want to use DISTINCT ON:
>
> contrib_regression=# explain select distinct on (t <-> 'foo') *, t <-> 'foo' from test_trgm order by t <-> 'foo' limit 10;
>                                     QUERY PLAN
> -----------------------------------------------------------------------------------
>  Limit  (cost=0.00..0.87 rows=10 width=12)
>    ->  Unique  (cost=0.00..86.75 rows=1000 width=12)
>          ->  Index Scan using ti on test_trgm  (cost=0.00..84.25 rows=1000 width=12)
>                Order By: (t <-> 'foo'::text)
> (4 rows)
>
> 			regards, tom lane

good tip - it's working

thank you

Regards

Pavel
Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys
2012/10/22 Tom Lane t...@sss.pgh.pa.us:
>> I wrote: I tested, and indeed this seems to work: CREATE TABLE t1 (c int[] WHERE EACH ELEMENT REFERENCES t2); and it's perfectly sensible from an English-grammar standpoint too. If we take that, how would we spell the table-constraint case exactly? Grammatically I'd prefer FOREIGN KEY (foo, EACH ELEMENT OF bar) REFERENCES
> Are people happy with these syntax proposals, or do we need some other color for the bikeshed?

I am ok.

Pavel

> 			regards, tom lane
Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys
On 10/22/2012 12:08 PM, Tom Lane wrote:
> I wrote: I tested, and indeed this seems to work: CREATE TABLE t1 (c int[] WHERE EACH ELEMENT REFERENCES t2); and it's perfectly sensible from an English-grammar standpoint too. If we take that, how would we spell the table-constraint case exactly? Grammatically I'd prefer FOREIGN KEY (foo, EACH ELEMENT OF bar) REFERENCES
> Are people happy with these syntax proposals, or do we need some other color for the bikeshed?

I can live with it, although the different spelling is slightly jarring.

cheers

andrew
Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys
On Monday, October 22, 2012 06:08:32 PM Tom Lane wrote:
> I wrote: I tested, and indeed this seems to work: CREATE TABLE t1 (c int[] WHERE EACH ELEMENT REFERENCES t2); and it's perfectly sensible from an English-grammar standpoint too. If we take that, how would we spell the table-constraint case exactly? Grammatically I'd prefer FOREIGN KEY (foo, EACH ELEMENT OF bar) REFERENCES
> Are people happy with these syntax proposals, or do we need some other color for the bikeshed?

Except that I'd prefer a WHERE in the table-constraint case as well for consistency's sake, I am unsurprisingly happy with the proposal.

Greetings,

Andres
--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [v9.3] Row-Level Security
Robert Haas robertmh...@gmail.com writes:
> The documentation lists several documented limitations that I would like to analyze a little bit. First, it says that row-level security policies are not applied on UPDATE or DELETE. That sounds downright dangerous to me. Is there some really compelling reason we're not doing it?

[ blink... ] Isn't that a security hole big enough for a Mack truck? UPDATE tab SET foo = foo RETURNING *; sucks out all the data just fine, if RLS doesn't apply to it.

Having said that, I fear that sensible row-level security for updates is at least one order of magnitude harder than sensible row-level security for selects. We've speculated about how to define that in the past, IIRC, but without any very satisfactory outcome.

			regards, tom lane
Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys
On 10/22/2012 12:13 PM, Andres Freund wrote:
> On Monday, October 22, 2012 06:08:32 PM Tom Lane wrote: I wrote: I tested, and indeed this seems to work: CREATE TABLE t1 (c int[] WHERE EACH ELEMENT REFERENCES t2); and it's perfectly sensible from an English-grammar standpoint too. If we take that, how would we spell the table-constraint case exactly? Grammatically I'd prefer FOREIGN KEY (foo, EACH ELEMENT OF bar) REFERENCES Are people happy with these syntax proposals, or do we need some other color for the bikeshed?
> Except that I'd prefer a WHERE in the table-constraint case as well for consistencies sake I am unsurprisingly happy with the proposal.

That would look odd too, especially if the array isn't the last element in the FK:

FOREIGN KEY (foo, WHERE EACH ELEMENT OF bar, baz) REFERENCES

cheers

andrew
Re: [HACKERS] Database object names and libpq in UTF-8 locale on Windows
Hi all,

I would appreciate some help or comments on this topic... Is this the wrong mailing list to ask such a question?

Seb

On 10/18/2012 10:15 AM, Sebastien FLAESCH wrote:

Hello,

Using PostgreSQL 9.2.1 on Windows, I am facing a strange character set issue with a UTF-8 database. Maybe this is expected, but I want to be sure that I am not missing something.

On Windows, I have created a database with:

ENCODING = 'UTF-8'
LC_COLLATE = 'English_United States.1252'
LC_CTYPE = 'English_United States.1252'

I am using libpq with a simple C program. The program handles UTF-8 strings. While it's not a traditional UNICODE Windows application using WCHAR, it should be possible to send and get back UTF-8 data with PostgreSQL. Interacting with the Windows system in the system locale is another story, but from a pure C / SQL / libpq point of view, as long as the PostgreSQL client encoding is properly defined to UTF-8, it should work, and it mostly does:

- I can use UTF-8 string constants in my queries.
- I can pass UTF-8 data to the database with parameterized queries.
- I can fetch UTF-8 data from the database.
- I can create db object names with UTF-8 characters, but the db object names must be specified with double quotes.

When I do not use quoted db object names, I get a strange problem. The table is created, I can use it in my program, I can even use it in the pgAdmin query tool, but in the pgAdmin db browser, there is no table name displayed in the treeview... Further, when selecting schema information from pg_class, I can see that the table exists, but there is nothing displayed in the relname attribute...

It appears that the problem disappears when using a C collation and char type:

ENCODING = 'UTF-8'
LC_COLLATE = 'C'
LC_CTYPE = 'C'

I suspect this has something to do with the fact that non-quoted identifiers are converted to lowercase, and because my LC_CTYPE is English_United States.1252, the conversion to lowercase fails...
But:

- why does PostgreSQL accept creating the table with invalid UTF-8 characters?
- why can I make queries in the query tool with the UTF-8 table name? (note that show client_encoding in the query tool gives me UNICODE - is this the same as UTF-8?)

So is there a bug, or do I have to use a C collation and char type for UTF-8 databases on Windows?

I know that UTF-8 is not supported by the Windows C/POSIX library (setlocale, and co). Does it mean that it's not realistic to use UTF-8 encoding for PostgreSQL databases on Windows...?

Thanks for reading.

Sebastien FLAESCH
Four Js Development Tools
Re: [HACKERS] Database object names and libpq in UTF-8 locale on Windows
On 10/22/2012 12:53 PM, Sebastien FLAESCH wrote:
> [Issues with unquoted utf8 identifiers in Windows 1252 locale]
> I suspect this has something to do with the fact that non-quoted identifiers are converted to lowercase, and because my LC_CTYPE is English_United States.1252, the conversion to lowercase fails...

Quite possibly. The code comment says this:

	/*
	 * SQL99 specifies Unicode-aware case normalization, which we don't yet
	 * have the infrastructure for. Instead we use tolower() to provide a
	 * locale-aware translation. However, there are some locales where this
	 * is not right either (eg, Turkish may do strange things with 'i' and
	 * 'I'). Our current compromise is to use tolower() for characters with
	 * the high bit set, and use an ASCII-only downcasing for 7-bit
	 * characters.
	 */

For now your best bet is probably not to use non-ASCII UTF-8 chars, or to quote the identifiers.

Given we're calling tolower() on a single byte in the code referred to, should we even be doing that when we have a multi-byte encoding and the high bit is set?

Aside: I'd love to fix up our treatment of identifiers, but there is probably a LOT of very tedious work involved.

cheers

andrew
Re: [HACKERS] ToDo: KNN Search should support DISTINCT clause?
On Mon, Oct 22, 2012 at 11:57 AM, Tom Lane t...@sss.pgh.pa.us wrote: Pavel Stehule pavel.steh...@gmail.com writes: but using DISTINCT breaks KNN searching optimization postgres=# explain select distinct nazobce, nazobce <-> 'Benešov' from obce order by nazobce <-> 'Benešov' limit 10 Don't hold your breath. There are two ways the system could implement the DISTINCT clause: either sort and uniq, or hashaggregate. hashaggregate will destroy any input ordering, so there's no value in using the index as input. Isn't that an implementation limitation though, rather than a fundamental limitation? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [v9.3] Row-Level Security
On Mon, Oct 22, 2012 at 12:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: The documentation lists several documented limitations that I would like to analyze a little bit. First, it says that row-level security policies are not applied on UPDATE or DELETE. That sounds downright dangerous to me. Is there some really compelling reason we're not doing it? [ blink... ] Isn't that a security hole big enough for a Mack truck? UPDATE tab SET foo = foo RETURNING *; sucks out all the data just fine, if RLS doesn't apply to it. Yep. Having said that, I fear that sensible row-level security for updates is at least one order of magnitude harder than sensible row-level security for selects. We've speculated about how to define that in the past, IIRC, but without any very satisfactory outcome. Uh, I don't agree. SELECT and DELETE are pretty much identical cases. UPDATE needs all the same stuff that those two cases need, plus it has an additional problem that it shares with INSERT - namely, someone might insert a tuple that they cannot see or update a tuple such that they can no longer see it. However, both of those problems can be handled via triggers, for now and maybe forever. In contrast, the problem that SELECT has - which UPDATE and DELETE also share - namely, of rows being visible that should not be - is not nearly so susceptible to that approach, both for performance reasons and because there's no such thing as a trigger on SELECT. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM
On Mon, Oct 22, 2012 at 10:57 AM, Andrew Dunstan and...@dunslane.net wrote: On 10/22/2012 10:18 AM, Robert Haas wrote: On Sun, Oct 21, 2012 at 11:02 AM, Martijn van Oosterhout klep...@svana.org wrote: It bugs me every time you have to jump through hoops and get red warnings for an unknown CA, whereas no encryption whatsoever is treated as fine while being actually even worse. +1. Amen, brother. Not really, IMNSHO. The difference is that an unencrypted session isn't pretending to be secure. In any case, it doesn't seem too intrusive for us to warn, at least in psql, with something like: SSL connection (cipher: DHE-RSA-AES256-SHA, bits: 256) Host Certificate Unverified Well, that change wouldn't bother me at all; in fact, I like it. But Firefox, for example, makes me do three or four clicks every time I go to a website with an invalid SSL certificate, whereas a web site that does not use SSL requires no clicks at all. What's the sense in that? If we imagine that all activity is user-initiated - that is, the user is always careful to ask for SSL when and only when they need a higher level of security - then that's pretty sensible. But in fact the world doesn't work that way. Most web pages are downloaded automatically when you click on a link, and you don't normally look to see whether SSL is in use unless you have a security concern (e.g. because you are logging into your bank's web site). If somebody went and trojaned my bank's web page, they wouldn't need to break the SSL certificate; they could just remove SSL from the login page altogether. Odds are very good that 95% of people wouldn't notice. I think it's great to have a full-paranoia mode where anything not kosher on the SSL connection is grounds for extreme panic. But it shouldn't be the default. What Ubuntu is doing does not solve every problem, but it does solve some problems, and we shouldn't go out of our way to break it.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys
Robert Haas robertmh...@gmail.com writes: On Mon, Oct 22, 2012 at 12:08 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: I tested, and indeed this seems to work: CREATE TABLE t1 (c int[] WHERE EACH ELEMENT REFERENCES t2); and it's perfectly sensible from an English-grammar standpoint too. If we take that, how would we spell the table-constraint case exactly? Grammatically I'd prefer FOREIGN KEY (foo, EACH ELEMENT OF bar) REFERENCES Are people happy with these syntax proposals, or do we need some other color for the bikeshed? Well, I can't say I'm very happy with the discrepancy between the two syntaxes, but I guess I'm in the minority. Still, I can't help but think it's going to be confusing and hard to remember. If we don't get complaints about it, I'll take that as evidence that the feature isn't being used, rather than evidence that the syntax is satisfactory. I'm not thrilled with the inconsistency either, but given the constraints we're under, it seems like the best we can do. (I feel, as Andrew does, that shoving WHERE into the table-constraint syntax would not be an improvement; but the column-constraint syntax really needs to start with a fully-reserved word). Have you got a better proposal? regards, tom lane
Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys
On Mon, Oct 22, 2012 at 12:08 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: I tested, and indeed this seems to work: CREATE TABLE t1 (c int[] WHERE EACH ELEMENT REFERENCES t2); and it's perfectly sensible from an English-grammar standpoint too. If we take that, how would we spell the table-constraint case exactly? Grammatically I'd prefer FOREIGN KEY (foo, EACH ELEMENT OF bar) REFERENCES Are people happy with these syntax proposals, or do we need some other color for the bikeshed? Well, I can't say I'm very happy with the discrepancy between the two syntaxes, but I guess I'm in the minority. Still, I can't help but think it's going to be confusing and hard to remember. If we don't get complaints about it, I'll take that as evidence that the feature isn't being used, rather than evidence that the syntax is satisfactory. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Deprecating RULES
[I'm replying to Robert's message only because it is the latest on the thread; I'm actually kinda replying to the whole thread in general.] When catching up on a backlog, one would hope that any thread comprising more than 5% of said backlog would be more constructive. :-( As someone coming in late with no skin in the game, here are my observations: (1) A suggestion was made by someone who was unaware of any actual productive uses of rules (outside of the red herring of the internal implementation detail regarding views -- which many other products manage to provide without rules) that we clean up what was assumed to be old baggage. I viewed the suggestion as having been made in more or less the same spirit as suggesting cleaning up include directives which weren't really needed or eliminating the storage manager layer: sort of a bother, unfortunately some risk, but resulting in cleaner and more maintainable code in the long run. (2) My initial gut reaction to the suggestion was positive, as my only attempt at using rules resulted in some very astonishing and dangerous behavior in testing. When I asked about it I seem to remember being told that rules were an old legacy feature which had no real use and would generally bite you badly when an unexpected type of query was run against the table with the rule. This matched my later experiences of seeing people ask questions when they were bitten or having problems getting rules to work as intended. (3) A number of people then responded with claims that rules were useful, but when those ignorant of such uses were curious about examples, were either given hand-wavey descriptions or angry-sounding challenges to prove that there were no such uses. (4) Subsequent discussion has produced a few shadowy hints at what such uses look like, with the most concrete being a one-time load of partitions, for which rules are apparently one or two orders of magnitude faster than FOR EACH ROW triggers. 
There was also a mention of writing to a log table being easier. Out of 100+ messages on this thread, I can't recall anything else that wasn't pretty vague. (5) Even some of those opposing deprecation say they would like to see rules go away eventually, once all of the (unspecified) uses have better alternatives. (6) There has been an assertion that it is impossible for the people on the -hackers list to properly identify and enumerate the valid real-world use-cases for rules; that we need to draw such information from a wider group. (7) There has been an acknowledgement that the documentation neither makes clear where rules might really be useful, nor how they can produce surprising results, including eating data. (Brainz aside, I assume we can all agree that a rule can surprise you by eating *data* you didn't expect it to?) I can't think of anything I got out of the thread beyond the above. Given the above, it seems that the first priority should be doing something about the documentation. Since as far as I can tell nobody has *any* trouble coming up with dangerous uses of rules, the hold-up on getting anything else done is in getting a wide sampling of appropriate and safe usage. I invite anyone who thinks rules should stay permanently to write a blog entry on Why Rules Are a Cool Feature. In particular, I would love to see it include useful examples of what Andrew calls non-trivial rules on a blog page he wrote: http://blog.rhodiumtoad.org.uk/2010/06/21/the-rule-challenge/ Failing that, how do we collect a broad enough sampling of current usage to be sure we know when we have alternatives for every current valid usage? -Kevin
Re: [HACKERS] ToDo: KNN Search should support DISTINCT clause?
Robert Haas robertmh...@gmail.com writes: On Mon, Oct 22, 2012 at 11:57 AM, Tom Lane t...@sss.pgh.pa.us wrote: Don't hold your breath. There are two ways the system could implement the DISTINCT clause: either sort and uniq, or hashaggregate. hashaggregate will destroy any input ordering, so there's no value in using the index as input. Isn't that an implementation limitation though, rather than a fundamental limitation? Perhaps, but it's not a simple one to surmount, and I'm dubious about putting the amount of work that'd be required into such a corner case. regards, tom lane
Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans
On Thu, Oct 18, 2012 at 7:42 PM, Claudio Freire klaussfre...@gmail.com wrote: Fun. That didn't take long. With the attached anti-sequential scan patch, and effective_io_concurrency=8: QUERY PLAN - GroupAggregate (cost=0.00..4149039.04 rows=90257289 width=4) (actual time=26.964..84299.789 rows=9001 loops=1) -> Index Only Scan Backward using pgbench_accounts_pkey on pgbench_accounts (cost=0.00..2795179.71 rows=90257289 width=4) (actual time=26.955..62761.774 rows=9001 loops=1) Index Cond: ((aid = 1000) AND (aid = 2)) Heap Fetches: 0 Total runtime: 87170.355 ms I/O throughput 22MB/s (twice as fast) I/O utilization 95% (I was expecting 100% but... hey... good enough) With e_i_c=24, it gets to 100% utilization and 30MB/s (that's 3 times faster). So, I'd like to know what you think, but maybe for back-sequential scans, prefetch should be set to a multiple (ie: x24) of e_i_c, in order to exploit read request merges. Earlier patch had a regression for heap-including scans backwards with RAID, so I made the back-sequential optimization index-only-only and now I can find no regression. Make check runs fine, btw. I hope I'm not talking to myself. postgresql-git-bt_prefetch_backseq.diff Description: Binary data
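The block-selection arithmetic being proposed is easy to state separately from the patch. The sketch below is ours (Python, not the patch's C code, which works at the buffer-manager level), just to illustrate why a window of effective_io_concurrency times some merge factor gives the kernel adjacent descending requests it can merge into larger reads:

```python
def backward_prefetch_targets(next_block: int, eic: int, merge_factor: int = 24) -> list[int]:
    """Blocks to prefetch before reading next_block during a backward scan.

    Toy illustration only: with effective_io_concurrency = eic, issuing a
    window of eic * merge_factor descending, adjacent block numbers at once
    hands the kernel read requests it can merge (the x24 suggestion above).
    """
    window = eic * merge_factor
    low = max(0, next_block - window)  # never prefetch below block 0
    return list(range(next_block - 1, low - 1, -1))
```

The function only computes which blocks to request; in the real patch the requests themselves would be issued via posix_fadvise-style hints, as the existing effective_io_concurrency machinery does for bitmap scans.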
Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM
On Fri, Oct 12, 2012 at 10:47 PM, Stephen Frost sfr...@snowman.net wrote: * Marko Kreen (mark...@gmail.com) wrote: As it works only on connect time, it can actually be secure, unlike user switching with SET ROLE. I'm guessing your issue with SET ROLE is that a RESET ROLE can be issued later..? If so, I'd suggest that we look at fixing that, but realize it could break poolers. For that matter, I'm not sure how the proposal to allow connections to be authenticated as one user but authorized as another (which we actually already support in some cases, eg: peer) *wouldn't* break poolers, unless you're suggesting they either use a separate connection for every user, or reconnect every time, both of which strike me as defeating a great deal of the point of having a pooler in the first place... The point of pooler is to cache things. The TCP connection is only one thing to be cached, all the backend-internal caches are as interesting - prepared plans, compiled functions. The fact that on role reset you need to drop all those things is what is breaking pooling. Of course, I'm speaking only about high-performance situations. Maybe there are cases where indeed the authenticated TCP connection is only interesting to be cached. Eg. with dumb client with raw sql only, where there is nothing to cache in backend. But it does not seem like primary scenario we should optimize for. -- marko
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility
On 10/19/12 1:26 PM, Josh Berkus wrote: What I'm saying is, we'll get nowhere promoting an application queue which is permanently inferior to existing, popular open source software. My advice: Forget about the application queue aspects of this. Focus on making it work for replication and matviews, which are already hard use cases to optimize. If someone can turn this feature into the base for a distributed queueing system later, then great. But let's not complicate this feature by worrying about a use case it may never fulfill. And as someone else mentioned... we should call this a stream and not a queue, since this would be lacking in many queue features. It certainly sounds like a useful framework to have. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys
On Mon, Oct 22, 2012 at 2:29 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'm not thrilled with the inconsistency either, but given the constraints we're under, it seems like the best we can do. (I feel, as Andrew does, that shoving WHERE into the table-constraint syntax would not be an improvement; but the column-constraint syntax really needs to start with a fully-reserved word). Have you got a better proposal? Well, I think if that's the best we can do, your original proposal of ditching the column constraint syntax altogether might be for the best. I wasn't too excited about that before, but I think having two different syntaxes is going to be even worse. In some ways, it's actually sort of sensible, because the referring side isn't really the column itself; it's some value extracted therefrom. You can imagine other variants of that as well, such as the recently-suggested FOREIGN KEY ((somecol).member_name) REFERENCES othertab (doohicky) Now, what would the column-constraint version of that look like? Is it even sensible to think that there SHOULD be a column-constraint version of that? I'm not convinced it is sensible, so maybe decreeing that the table constraint version must be used to handle all non-trivial cases is more sensible than I initially thought. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys
Robert Haas robertmh...@gmail.com writes: Well, I think if that's the best we can do, your original proposal of ditching the column constraint syntax altogether might be for the best. I wasn't too excited about that before, but I think having two different syntaxes is going to be even worse. In some ways, it's actually sort of sensible, because the referring side isn't really the column itself; it's some value extracted therefrom. You can imagine other variants of that as well, such as the recently-suggested FOREIGN KEY ((somecol).member_name) REFERENCES othertab (doohicky) Now, what would the column-constraint version of that look like? Is it even sensible to think that there SHOULD be a column-constraint version of that? I'm not convinced it is sensible, so maybe decreeing that the table constraint version must be used to handle all non-trivial cases is more sensible than I initially thought. I could easily go with that ... regards, tom lane
Re: [HACKERS] ToDo: KNN Search should support DISTINCT clause?
2012/10/22 Tom Lane t...@sss.pgh.pa.us: Robert Haas robertmh...@gmail.com writes: On Mon, Oct 22, 2012 at 11:57 AM, Tom Lane t...@sss.pgh.pa.us wrote: Don't hold your breath. There are two ways the system could implement the DISTINCT clause: either sort and uniq, or hashaggregate. hashaggregate will destroy any input ordering, so there's no value in using the index as input. Isn't that an implementation limitation though, rather than a fundamental limitation? Perhaps, but it's not a simple one to surmount, and I'm dubious about putting the amount of work that'd be required into such a corner case. I don't think this use case is too special - but the workaround works well Regards Pavel regards, tom lane
Re: [HACKERS] [v9.3] Row-Level Security
2012/10/22 Robert Haas robertmh...@gmail.com: On Thu, Oct 18, 2012 at 2:19 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Kohei KaiGai escribió: The revised patch fixes the problem that Dean pointed out. Robert, would you be able to give this latest version of the patch a look? Yeah, sorry I've been completely sidelined this CommitFest. It's been a crazy couple of months. Prognosis for future craziness reduction uncertain. Comments: The documentation lists several documented limitations that I would like to analyze a little bit. First, it says that row-level security policies are not applied on UPDATE or DELETE. That sounds downright dangerous to me. Is there some really compelling reason we're not doing it? It intends to simplify the patch to avoid doing everything within a single patch. I'll submit the patch supporting UPDATE and DELETE for CF-Nov in addition to the base one. Second, it says that row-level security policies are not currently applied on INSERT, so you should use a trigger, but implies that this will change in the future. I don't think we should change that in the future; I think relying on triggers for that case is just fine. Note that it could be an issue with the post-image for UPDATES, as well, and I think the trigger solution is similarly adequate to cover that case. Hmm. I should not have written this in the section on current limitations. It may give the impression that the behavior will be changed in the future. OK, I'll try to revise the documentation stuff. With respect to the documented limitation regarding DECLARE/FETCH, what exactly will happen? Can we describe this a bit more clearly rather than just saying the behavior will be unpredictable? In case the user-id was switched after declaration of a cursor that contains a qualifier depending on current_user, its result set contains rows with the old user-id and rows with the new user-id. Here is one other option rather than a documentation fix. 
As we had a discussion on the upthread, it can be solved if we restore the user-id associated with the portal to be run, however, a problem is some commands switches user-id inside of the portal. http://archives.postgresql.org/pgsql-hackers/2012-07/msg00055.php Is there some good idea to avoid the problem? It looks suspiciously as if the row-level security mode needs to be saved and restored in all the same places we call save and restore the user ID and security context. Is there some reason the row-level-security-enabled flag shouldn't just become another bit in the security context? Then we'd get all of this save/restore logic mostly for free. It seems to me a good idea, but I didn't find out this. ATExecSetRowLevelSecurity() calls SetRowLevelSecurity() or ResetRowLevelSecurity() to update pg_rowlevelsec, but does the pg_class update itself. I think that all of this logic should be moved into a single function, or at least functions in the same file, with the one that only updates pg_rowlevelsec being static and therefore not able to be called from outside the file. We always need the pg_class update and the pg_rowlevelsec update to happen together, so it's not good to have an exposed function that does one of those updates but not the other. I think the simplest thing is just to move ATExecSetRowLevelSecurity to pg_rowlevelsec.c and rename it to SetRowLevelSecurity() and then give it two static helper functions, say InsertPolicyRow() and DeletePolicyRow(). OK, I'll rework the code. I think it would be good if Tom could review the query-rewriting parts of this (viz rowlevelsec.c) as I am not terribly familiar with this machinery, and of course anything we get wrong here will have security consequences. At first blush, I'm somewhat concerned about the fact that we're trying to do this after query rewriting; that seems like it could break things. 
I know KaiGai mentioned upthread that the rewriter won't be rerun if the plan is invalidated, but (1) why is that OK now? and (2) if it is OK now, then why is it OK to do rewriting of the RLS qual - only - after rewriting if all of the rest of the rewriting needs to happen earlier? I just follow the existing behavior of plan invalidation; that does not re-run the query rewriter. So, if we have no particular reason why we should not run the rewriter again to handle RLS quals, it might be an option to handle RLS as a part of the rewriter. At least, there are two problems. 1) System columns are problematic when a SELECT statement is replaced by a sub-query. 2) It makes infinite recursion when a certain table has a SELECT INSTEAD rule with a sub-query on the same table. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp
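Robert's "another bit in the security context" suggestion can be sketched as a toy model. The names below are ours (the real PostgreSQL calls are GetUserIdAndSecContext()/SetUserIdAndSecContext() with bits like SECURITY_RESTRICTED_OPERATION; the RLS bit is the hypothetical part): once the row-level-security flag lives in the saved context word, every existing save/restore site restores it automatically, including on error.

```python
from contextlib import contextmanager

# Toy model; names mirror PostgreSQL's security-context bits, and the
# RLS bit is the hypothetical addition under discussion.
SECURITY_LOCAL_USERID_CHANGE  = 0x0001
SECURITY_RESTRICTED_OPERATION = 0x0002
SECURITY_ROW_LEVEL_DISABLED   = 0x0004  # proposed new bit

_state = {"user_id": 10, "sec_context": 0}

def get_user_id_and_sec_context():
    return _state["user_id"], _state["sec_context"]

def set_user_id_and_sec_context(user_id, sec_context):
    _state["user_id"], _state["sec_context"] = user_id, sec_context

@contextmanager
def run_as(user_id, extra_bits):
    # Save/restore exactly as the existing call sites do; because the RLS
    # flag is just a context bit, it is restored "for free", even on error.
    save_uid, save_ctx = get_user_id_and_sec_context()
    set_user_id_and_sec_context(user_id, save_ctx | extra_bits)
    try:
        yield
    finally:
        set_user_id_and_sec_context(save_uid, save_ctx)
```

This is why piggybacking on the existing save/restore logic is attractive: there is no separate RLS state that a forgotten restore path could leave behind.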
[HACKERS] [Bug] SELECT INSTEAD with sub-query
I found a case that causes an Assert() failure during investigation of RLS... postgres=# CREATE TABLE t1 (x int, y text); CREATE TABLE postgres=# CREATE RULE "_RETURN" AS ON SELECT TO t1 DO INSTEAD SELECT 1 AS x, 'aaa'::text AS y; CREATE RULE postgres=# SELECT * FROM t1; x | y ---+- 1 | aaa (1 row) postgres=# SELECT tableoid, * FROM t1; TRAP: FailedAssertion("!(attno >= rel->min_attr && attno <= rel->max_attr)", File: "initsplan.c", Line: 180) The connection to the server was lost. Attempting reset: LOG: server process (PID 27345) was terminated by signal 6: Aborted DETAIL: Failed process was running: SELECT tableoid, * FROM t1; The failure scenario is obvious. The reference to t1 was replaced by a sub-query that does not have any system columns; however, the given query tries to reference a column with a negative attribute number. Then, it was caught by the FailedAssertion. How do we fix the problem? Please give us some comments. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp
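To restate the failure mode: the Assert checks that every referenced attribute number lies within the relation's [min_attr, max_attr] range, and once the ON SELECT rule replaces t1 with a sub-query, negative system-column numbers like tableoid's fall outside it. A toy model (the Python structure is ours, for illustration only; the attribute numbering is PostgreSQL's, where system columns are negative):

```python
# tableoid's attribute number in this era's PostgreSQL headers (assumption
# stated for illustration; system-column attnos are negative).
TABLEOID_ATTNO = -7

class RelOptInfo:
    """Toy stand-in for the planner's per-relation info."""
    def __init__(self, min_attr: int, max_attr: int):
        self.min_attr = min_attr
        self.max_attr = max_attr

def attno_in_range(rel: RelOptInfo, attno: int) -> bool:
    # The Assert in initsplan.c: attno >= rel->min_attr && attno <= rel->max_attr
    return rel.min_attr <= attno <= rel.max_attr

# A plain two-column table advertises the negative system columns ...
plain_t1 = RelOptInfo(min_attr=TABLEOID_ATTNO, max_attr=2)
# ... but the sub-query substituted by the ON SELECT rule has only x and y:
rewritten_t1 = RelOptInfo(min_attr=1, max_attr=2)
```

So any fix has to either reject system-column references against such a relation with a proper error, or teach the substituted sub-query to supply them.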
Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM
On Wed, Oct 10, 2012 at 4:24 PM, Marko Kreen mark...@gmail.com wrote: The SCRAM looks good from the quick glance. SCRAM does have a weakness - the info necessary to log in as the client (ClientKey) is exposed during the authentication process. IOW, the stored auth info can be used to log in as the client, if the attacker can listen on or participate in the login process. The random nonces used during auth do not matter; what matters is that the target server has the same StoredKey (same password, salt and iter). It seems this particular attack is avoided by SRP. This weakness can be seen as a feature tho - it can be used by poolers to cache auth info and re-connect to the server later. They still need full access to stored keys. But it does give different security guarantees depending on whether SSL is in use or not. -- marko
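Marko's point can be checked directly against the RFC 5802 formulas. The sketch below (Python, SHA-1 as in SCRAM-SHA-1; the helper names and all values are ours, for illustration) derives the keys and shows that anyone holding StoredKey who observes AuthMessage and ClientProof can XOR out ClientKey -- which is exactly what the server stores being enough, combined with eavesdropping, to impersonate the client later:

```python
import hashlib
import hmac

def scram_keys(password: bytes, salt: bytes, iterations: int):
    # SaltedPassword := Hi(password, salt, i), i.e. PBKDF2 with HMAC-SHA-1
    salted = hashlib.pbkdf2_hmac("sha1", password, salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha1).digest()
    stored_key = hashlib.sha1(client_key).digest()  # what the server stores
    return client_key, stored_key

def client_proof(client_key: bytes, stored_key: bytes, auth_message: bytes) -> bytes:
    # ClientProof := ClientKey XOR HMAC(StoredKey, AuthMessage)
    sig = hmac.new(stored_key, auth_message, hashlib.sha1).digest()
    return bytes(a ^ b for a, b in zip(client_key, sig))

def recover_client_key(stored_key: bytes, auth_message: bytes, proof: bytes) -> bytes:
    # The same XOR, run by anyone who knows StoredKey and saw the exchange:
    # it yields ClientKey, which is enough to authenticate later.
    sig = hmac.new(stored_key, auth_message, hashlib.sha1).digest()
    return bytes(a ^ b for a, b in zip(proof, sig))
```

The random nonces appear only inside AuthMessage, which the eavesdropper also sees, so they cancel out of the recovery -- matching Marko's observation that the nonces do not matter, only possession of StoredKey does.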
Re: [HACKERS] Deprecating RULES
On Fri, Oct 19, 2012 at 2:55 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Oct 19, 2012 at 10:29 AM, Andrew Dunstan and...@dunslane.net wrote: As you can see, in the case of rewrite it takes us back 7 1/2 years. I know this is a *very* rough measure, but it still tends to indicate to me that the maintenance burden isn't terribly high. That's a pretty neat one-liner. However... in my view, the real cost of rules is that they are hard to support as we add new features to SQL. I believe we already decided to punt on making them work with CTEs... and maybe one other case? I don't really remember the details any more, but presumably this will come up again with MERGE, and perhaps other cases... Good point on the CTE (and it's correct). I think by any reasonable definition rules are in fact already de facto deprecated: they are not being extended to interact with other features and the community is advising against their use. I don't think anybody would complain if/when a hypothetical MERGE feature was advanced without rule interaction. That said, I don't think there is any reasonable argument to remove rules. Backwards compatibility should only be broken when it *must* be broken. Any 'developer interest only' standards ('grotty code', 'inelegant', 'ill advised for new code', etc) of removal are completely specious and thus are IMSNHO irrelevant. merlin
Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM
On Sun, Oct 21, 2012 at 5:49 PM, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: I don't see a problem at all with providing the snakeoil cert. In fact, it's quite useful. I see a problem with enabling it by default. Because it makes people think they are more secure than they are. I am far from an SSL expert, but I had the idea that the only problem with a self-signed cert is that the client can't trace it to a trusted cert --- so if the user took the further step of copying the cert to the client machines' ~/.postgresql/root.crt files, wouldn't things be just fine? I'm not sure if server certs are supposed to go in the root.crt file or somewhere else. It's a bit tricky to distribute them securely but most people will just scp them and call that good since they ignore the ssh host key messages anyways. Fwiw the main problem here is that you're vulnerable to a MitM attack. In theory we could work around that by doing something like encrypting the ssl public key in a key based on a query provided by the user. That query would have to include some server data that the client can predict and that only the correct server would have access to. There are obvious problems with this though and inventing our own security protocol is almost certainly a bad idea even if we can fix them. In a browser, they will get a big fat warning every time, so they will know it. There is no such warning in psql. Actually, maybe we should *add* such a warning. We could do it in psql. We can't do it in libpq for everyone, but we can do it in our own tools... Particularly since we do print the SSL information already - we could just add a warning: cert not verified or something like that to the same piece of information. I think we can provide a much better warning however. 
I think we want something like 'WARNING: Server identity signed by unknown and untrusted authority Snakeoil CA' We could go even further: INFO: Server identity ACME Debian Machine certified by Snakeoil CA WARNING: Server identity signed by unknown and untrusted authority Snakeoil CA HINT: Add either the server certificate or the CA certificate to /usr/lib/ssl/certs after verifying the identity and certificate hash SSL is notoriously hard to set up; it would go a long way to give the sysadmin an immediate pointer to what certificates are being used and where to find or install the CA certs. It might be worth mentioning the GUC parameter names to control these things too. What happens in the other direction, ie if a client presents a self-signed cert that the server can't verify? Surely that's just a failure. The server always expects client authentication, and a connection authenticated using an unverified cert could be anyone at all. Clients traditionally didn't authenticate the server until encrypted connections entered the picture and preventing MitM attacks became relevant. -- greg
Re: [HACKERS] ToDo: KNN Search should support DISTINCT clause?
On Mon, Oct 22, 2012 at 4:57 PM, Tom Lane t...@sss.pgh.pa.us wrote: Don't hold your breath. There are two ways the system could implement the DISTINCT clause: either sort and uniq, or hashaggregate. hashaggregate will destroy any input ordering, so there's no value in using the index as input. sort and uniq requires the input to be sorted by *all* the columns being distinct'ed, not just one, so again this index isn't useful. We already have some bits that understand functional dependencies for distinct/group by don't we? Do we detect that col <-> 'foo' is functionally dependent on col? If so is it possible to construct a multicolumn index that can produce an ordering like [col <-> 'foo', col] which could be used to get distinct values of col in the knn order (since the first column is functionally dependent on the second it can be ignored for grouping purposes). Not that we can do this now but I wonder whether a lot of the pieces are already there and just need to be hooked up together. -- greg
Re: [HACKERS] Deprecating RULES
Merlin Moncure wrote:
> Good point on the CTE (and it's correct).  I think by any reasonable
> definition rules are in fact already de facto deprecated: they are not
> being extended to interact with other features and the community is
> advising against their use.  I don't think anybody would complain
> if/when a hypothetical MERGE feature was advanced without rule
> interaction.
>
> That said, I don't think there is any reasonable argument to remove
> rules.  Backwards compatibility should only be broken when it *must*
> be broken.  Any 'developer interest only' standards for removal
> ('grotty code', 'inelegant', 'ill advised for new code', etc.) are
> completely specious and thus are IMNSHO irrelevant.
>
> merlin

While I agree with this sentiment to some degree, in order for the community to thrive new developer blood needs to be introduced periodically. Not that this feature is particularly an issue, but making the codebase easier to learn and maintain has considerable value in its own right.

To put a different spin on things: CREATE RULE is a specialty tool. Taken that way, we should strictly describe the use-cases where CREATE RULE behavior is well-defined and problem-free. If the end-user isn't trying to use RULEs in exactly those cases, then they are advised to attempt another solution, or to send an e-mail to the list to get some expert opinions on that particular use-case. Known problematic uses can also be listed, to minimize the number of "this isn't listed, what do y'all think?" e-mails sent to the list. In this setup there is some developer obligation to try not to break those well-defined use-cases, but that obligation exists today even if it is not explicitly mentioned.

David J.
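As one example of the kind of "known problematic use" such a list could document (object names here are invented for illustration): a rule macro-expands the original query, so volatile functions can be evaluated more than once.

```sql
-- A DO ALSO rule re-substitutes the inserted expressions into the
-- extra query, so nextval() below is evaluated twice: t and t_log can
-- end up with different id values for the same logical insert.
CREATE SEQUENCE t_id_seq;
CREATE TABLE t     (id int, v text);
CREATE TABLE t_log (id int, v text);
CREATE RULE t_audit AS ON INSERT TO t
    DO ALSO INSERT INTO t_log VALUES (NEW.id, NEW.v);
INSERT INTO t VALUES (nextval('t_id_seq'), 'x');
-- t.id and t_log.id now differ; a trigger would not have this problem.
```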
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility
On Wed, Oct 17, 2012 at 7:48 PM, Christopher Browne cbbro...@gmail.com wrote:
> Well, replication is arguably a relevant case.  For Slony, the
> origin/master node never cares about logged changes - that data is
> only processed on replicas.  Now, that's certainly a little weaselly -
> the log data (sl_log_*) has got to get read to get to the replica.

Well, this is a clever way for Slony to use existing infrastructure to get data into the WAL. But wouldn't it be more logical for an in-core system to just annotate the existing records with enough information to replay them logically? Instead of synthesizing inserts into an imaginary table containing data that can be extracted to retrieve info about some other record, just add the needed info to the relevant record.

The minimum needed for DML, afaict, is that DELETE and UPDATE records carry the primary key of the record being deleted or updated. It might make sense to include the whole tupledesc, or at least key parts of it like the attlen and atttyp arrays, so that replay can be more robust. But the logical place for this data, it seems to me, is *in* the update or insert record that already exists. Otherwise managing logical standbys will require a whole duplicate set of infrastructure to keep track of what has and hasn't been replayed. For instance, what if an update record is covered by a checkpoint but the logical record falls after the checkpoint and the system crashes before writing it out?

-- 
greg
Re: [HACKERS] [Bug] SELECT INSTEAD with sub-query
Kohei KaiGai kai...@kaigai.gr.jp writes:
> postgres=# CREATE TABLE t1 (x int, y text);
> CREATE TABLE
> postgres=# CREATE RULE "_RETURN" AS ON SELECT TO t1
>            DO INSTEAD SELECT 1 AS x, 'aaa'::text AS y;
> CREATE RULE
> postgres=# SELECT * FROM t1;
>  x |  y
> ---+-----
>  1 | aaa
> (1 row)
>
> postgres=# SELECT tableoid, * FROM t1;
> TRAP: FailedAssertion(!(attno >= rel->min_attr && attno <= rel->max_attr), File: "initsplan.c", Line: 180)

Huh.  I'm amazed nobody noticed this before.  It's really a bug in the convert-table-to-view logic: a view has no system columns, so we ought to remove the pg_attribute entries for the system columns, but we don't.

Given that we can't retroactively fix the pg_attribute rows for existing views, I guess we'll also have to put in a defense in the parser to not believe that views have any system columns, even if pg_attribute claims they do.

			regards, tom lane
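For anyone who wants to see the stale catalog rows Tom describes, a quick check (a sketch against the t1 example above): real tables carry system-column rows at negative attnums in pg_attribute, which genuine views lack.

```sql
-- System columns (ctid, xmin, cmin, ...) live at negative attnums.
-- For a view created with CREATE VIEW this returns no rows; for a
-- table converted by a "_RETURN" rule, the old entries linger.
SELECT attname, attnum
FROM pg_attribute
WHERE attrelid = 't1'::regclass
  AND attnum < 0
ORDER BY attnum;
```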