Re: [HACKERS] New WAL code dumps core trivially on replay of bad data
On 18.08.2012 08:52, Amit kapila wrote: Tom Lane Sent: Saturday, August 18, 2012 7:16 AM so it merrily tries to compute a checksum on a gigabyte worth of data, and soon falls off the end of memory. In reality, inspection of the WAL file suggests that this is the end of valid data and what should have happened is that replay just stopped. The xl_len and so forth shown above are just garbage from off the end of what was actually read from the file (everything beyond offset 0xcebff8 in file 4 is in fact zeroes). I'm not sure whether this is just a matter of having failed to sanity-check that xl_tot_len is at least SizeOfXLogRecord, or whether there is a deeper problem with the new design of continuation records that makes it impossible to validate records safely. Earlier there was a check related to total length in ReadRecord, before it calls RecordIsValid() if (record-xl_tot_len SizeOfXLogRecord + record-xl_len || record-xl_tot_len SizeOfXLogRecord + record-xl_len + XLR_MAX_BKP_BLOCKS * (sizeof(BkpBlock) + BLCKSZ)) I think that missing check of total length has caused this problem. However now this check will be different. That check still exists, in ValidXLogRecordHeader(). However, we now allocate the buffer for the whole record before that check, based on xl_tot_len, if the record header is split across pages. The theory in allocating the buffer is that a bogus xl_tot_len field will cause the malloc() to fail, returning NULL, and we treat that the same as a broken header. However, with memory overcommit, what happens is that the malloc() succeeds, but the process is killed when it actually tries to use all that memory. I think we need to delay the allocation of the record buffer. We need to read and validate the whole record header first, like we did before, before we trust xl_tot_len enough to call malloc() with it. I'll take a shot at doing that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] State of the on-disk bitmap index
Daniel Bausch wrote: Hello Jonah, Simon, and the hackers, I am going to implement a simple kind of encoded bitmap indexes (EBI). That is an index type where the bitmap columns may not only contain only a single '1' in the set of bits belonging to a tuple. Instead, an additional mapping table translates the distinct values of the table column into a unique encoding. To select for a given value all bitmap columns must be compared instead of only one. Queries that match multiple different values (like IN lists or range queries) simplify to less than the full set of bitmaps that needs to be compared because of boolean logic. The total number of bitmaps required to represent unique encodings for all different values is ceil(ld(n)), where n is the number of distinct values. Compared to normal bitmap indexes this solves the problem of high-cardinality columns. It is targetet at data warehousing scenarios with insert only data. The respective scientific paper can be found at http://www.dvs.tu-darmstadt.de/publications/pdf/ebi_a4.pdf I cannot answer your questions, but I read the paper and have some questions myself. 1) As you mention, a WHERE clause that checks for only one value will be more expensive with an encoded bitmap index than with a regular bitmap index. If you want to implement encoded bitmap indexes, wouldn't it be good to also implement regular bitmap indexes so that the user has a choice? 2) The paper mentions that finding a good encoding and simplifying bitmap access for a certain query are nontrivial problems. Moreover, an encoding is good or bad only with respect to certain queries, which the system does not know at index creation time. Do you have any ideas how to approach that? If not, the paper suggests that, with enough values to check for, even a non-optimized encoded bitmap index should perform much better than a normal bitmap index, so maybe that's the way to go (maybe only encode the NULL value as all zeros). Yours, Laurenz Albe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Primary Key Constraint on inheritance table not getting route to child tables
Hi, ALTER TABLE ADD Constraints PRIMARY KEY on inheritance table not getting route to child table. But when we do ALTER TABLE DROP Constraint on the same, it complains about constraint does not exists on child table. Consider the following example psql=# CREATE TABLE measurement ( psql(# city_id int not null, psql(# logdate date not null, psql(# peaktempint, psql(# unitsales int psql(# ); CREATE TABLE psql=# CREATE TABLE measurement_y2006m02 ( psql(# CHECK ( logdate = DATE '2006-02-01' AND logdate DATE '2006-03-01' ) psql(# ) INHERITS (measurement); CREATE TABLE psql=# CREATE TABLE measurement_y2006m03 ( psql(# CHECK ( logdate = DATE '2006-03-01' AND logdate DATE '2006-04-01' ) psql(# ) INHERITS (measurement); CREATE TABLE psql=# psql=# psql=# ALTER TABLE measurement ADD CONSTRAINT con1 PRIMARY KEY (city_id); ALTER TABLE psql=# psql=# -- Don't have primary key on child table psql=# desc measurement_y2006m02 Table public.measurement_y2006m02 Column |Type | Modifiers ---+-+--- city_id | integer | not null logdate | timestamp without time zone | not null peaktemp | integer | unitsales | integer | Check constraints: measurement_y2006m02_logdate_check CHECK (logdate = '01-FEB-06 00:00:00'::timestamp without time zone AND logdate '01-MAR-06 00:00:00'::timestamp without time zone) Inherits: measurement -- Primary key on parent table psql=# desc measurement Table public.measurement Column |Type | Modifiers ---+-+--- city_id | integer | not null logdate | timestamp without time zone | not null peaktemp | integer | unitsales | integer | Indexes: con1 PRIMARY KEY, btree (city_id) Number of child tables: 2 (Use \d+ to list them.) *psql=# ALTER TABLE measurement* *DROP CONSTRAINT con1;* *ERROR: constraint con1 of relation measurement_y2006m02 does not exist * I am not sure whether PRIMARY KEY not getting route is a expected behavior or not, but if its expected behavior then obviously DROP CONSTRAINT should not complain about constraint doesn't exists on child table. Inputs/Comments ? Thanks, Rushabh Lathia www.EnterpriseDB.com
Re: [HACKERS] [PATCH] Docs: Make notes on sequences and rollback more obvious
Trying again with the attachments; the archiver only seemed to see the first patch despite all three being attached. Including patches inline; if you want 'em prettier, see: https://github.com/ringerc/postgres/tree/sequence_documentation_fixes Subject: [PATCH 1/3] Make sure you can't read through mvcc.sgml without realising that not everything is MVCC. --- doc/src/sgml/mvcc.sgml | 12 1 file changed, 12 insertions(+) diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml new file mode 100644 index 8f88582..9dc65f5 *** a/doc/src/sgml/mvcc.sgml --- b/doc/src/sgml/mvcc.sgml *** *** 260,265 --- 260,277 command xref linkend=sql-set-transaction. /para +important + para +Some productnamePostgreSQL/productname data types and functions have +special rules regarding transactional behaviour. In particular, changes +made to a literalSEQUENCE/literal (and therefore the counter of a +literalSERIAL/literal) are immediately visible to all other +transactions and are not rolled back if the transaction that made the +changes aborts. See xref linkend=functions-sequence and +xref linkend=datatype-serial. + /para +/important + sect2 id=xact-read-committed titleRead Committed Isolation Level/title -- 1.7.11.2 Subject: [PATCH 2/3] Collect a list of features with abberant transactional behaviour --- doc/src/sgml/mvcc.sgml | 67 ++ 1 file changed, 67 insertions(+) diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml new file mode 100644 index 9dc65f5..e2930c9 *** a/doc/src/sgml/mvcc.sgml --- b/doc/src/sgml/mvcc.sgml *** SELECT pg_advisory_lock(q.id) FROM *** 1540,1543 --- 1540,1610 indexes should be used instead. /para /sect1 + + sect1 id=mvcc-exceptions +titleExceptions to normal transactional rules/title + +para + Some PostgreSQL features, functions and data types differ from the + usual transactional behaviour described in this chapter. Differences + are generally mentioned in the documentation sections for the + features they affect. Such exceptions are collected here for + easy reference. +/para + +para + The following actions and features don't follow the typical + transactional rules: +/para + +itemizedlist + listitem + para + Serial pseudo-types xref linkend=datatype-serial + /para + /listitem + listitem + para + literalSEQUENCE/literals - xref linkend=functions-sequence + /para + /listitem + listitem + para + Advisory locks - xref linkend=advisory-locks + /para + /listitem + listitem + para + Disk writes to files outside the database, as performed by + literalCOPY ... TO/literal, adminpack functions, and other add-ons. + See xref linkend=sql-copy, xref linkend=adminpack. + /para + /listitem + listitem + para + Any network I/O or inter-process communication not explicitly + described as transactional in its documentation. For example, + sending an email from PL/PerlU would not be transactional; + the email would be sent before the transaction commits and + could not be un-sent if the transaction were to roll back. + /listitem +/itemizedlist + +note + para + When working with external non-transactional resources like files + on disk or network sockets the two-phase commit feature can be + useful. See: xref linkend=sql-prepare-transaction + /para + para + LISTEN/NOTIFY provides a lighter weight but still transaction-friendly method of + triggering changes outside the database in response to changes inside the + database. A LISTENing helper program running outside the database can + perform actions when it gets a NOTIFY after a transaction commits. See: + xref linkend=sql-notify. + /para +/note + + /sect1 + /chapter -- 1.7.11.2 Subject: [PATCH 3/3] Change xref of important/ note re SERIAL to point to mvcc-exceptions --- doc/src/sgml/mvcc.sgml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml new file mode 100644 index e2930c9..0de4b75 *** a/doc/src/sgml/mvcc.sgml --- b/doc/src/sgml/mvcc.sgml *** *** 267,274 made to a literalSEQUENCE/literal (and therefore the counter of a literalSERIAL/literal) are immediately visible to all other transactions and are not rolled back if the transaction that made the !changes aborts. See xref linkend=functions-sequence and !xref linkend=datatype-serial. /para /important --- 267,273 made to a literalSEQUENCE/literal (and therefore the counter of a literalSERIAL/literal)
Re: [HACKERS] State of the on-disk bitmap index
Am 20.08.2012 09:40, schrieb Albe Laurenz: Daniel Bausch wrote: Hello Jonah, Simon, and the hackers, I am going to implement a simple kind of encoded bitmap indexes (EBI). That is an index type where the bitmap columns may not only contain only a single '1' in the set of bits belonging to a tuple. Instead, an additional mapping table translates the distinct values of the table column into a unique encoding. To select for a given value all bitmap columns must be compared instead of only one. Queries that match multiple different values (like IN lists or range queries) simplify to less than the full set of bitmaps that needs to be compared because of boolean logic. The total number of bitmaps required to represent unique encodings for all different values is ceil(ld(n)), where n is the number of distinct values. Compared to normal bitmap indexes this solves the problem of high-cardinality columns. It is targetet at data warehousing scenarios with insert only data. The respective scientific paper can be found at http://www.dvs.tu-darmstadt.de/publications/pdf/ebi_a4.pdf I cannot answer your questions, but I read the paper and have some questions myself. 1) As you mention, a WHERE clause that checks for only one value will be more expensive with an encoded bitmap index than with a regular bitmap index. If you want to implement encoded bitmap indexes, wouldn't it be good to also implement regular bitmap indexes so that the user has a choice? Sorry if that one was not clear: The first thing, I am going to do, is to work on the normal bitmap indexes (the one based on the Bizgres patch). I want to port it to master HEAD and give it back to the community. After that I want to base my EBI implementation on that. Eventually, I will publish that implementation, too. (After doing tuning, experiments, and make sure it works well.) 2) The paper mentions that finding a good encoding and simplifying bitmap access for a certain query are nontrivial problems. Moreover, an encoding is good or bad only with respect to certain queries, which the system does not know at index creation time. Actually, I was not involved in writing that paper. I want to use that idea to show something different. I know of a follow up work by Golam Rabilul Alam et al. that uses the query history and data mining on that to optimize for the most common cases. There may be others. A more detailed discussion of EBI can also be found in: http://www-old.dvs.informatik.tu-darmstadt.de/staff/wu/query.TR.ps.gz Do you have any ideas how to approach that? If not, the paper suggests that, with enough values to check for, even a non-optimized encoded bitmap index should perform much better than a normal bitmap index, so maybe that's the way to go (maybe only encode the NULL value as all zeros). Actually all zeros is reserved for non-existent (a.k.a. deleted or invisible). The thing with the enough values is a bit problematic, indeed, because even a DBA cannot influence how the queries of the user or the user application look like. You will not use encoded bitmap indexes or normal bitmap indexes for a column that is usually point accessed like the ID column. For that you will stick to hash or tree indexes. Kind regards, Daniel -- Daniel Bausch Wissenschaftlicher Mitarbeiter Technische Universität Darmstadt Fachbereich Informatik Fachgebiet Datenbanken und Verteilte Systeme Hochschulstraße 10 64289 Darmstadt Germany Tel.: +49 6151 16 6706 Fax: +49 6151 16 6229 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] State of the on-disk bitmap index
Am 20.08.2012 11:44, schrieb Daniel Bausch: Actually, I was not involved in writing that paper. I want to use that idea to show something different. I know of a follow up work by Golam Rabilul Alam et al. that uses the query history and data mining on that to optimize for the most common cases. There may be others. A more detailed discussion of EBI can also be found in: http://www-old.dvs.informatik.tu-darmstadt.de/staff/wu/query.TR.ps.gz Oops, that was the wrong link. I meant this one: http://www-old.dvs.informatik.tu-darmstadt.de/staff/wu/bitmap.ps.gz -- Daniel Bausch Wissenschaftlicher Mitarbeiter Technische Universität Darmstadt Fachbereich Informatik Fachgebiet Datenbanken und Verteilte Systeme Hochschulstraße 10 64289 Darmstadt Germany Tel.: +49 6151 16 6706 Fax: +49 6151 16 6229 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] enhanced error fields
Hello here is updated patch - merge comments, docs, formatting, some identifiers from Peter Geoghegan's patch Regards Pavel 2012/7/18 Pavel Stehule pavel.steh...@gmail.com: Hello * renamed erritem to err_generic_string * fixed CSVlog generation * new file /utils/error/relerror.c with axillary functions - declarations are in utils/rel.h Regards Pavel 2012/7/11 Tom Lane t...@sss.pgh.pa.us: Alvaro Herrera alvhe...@commandprompt.com writes: FWIW about the new include: I feel a strong dislike about the forward declaration you suggest. Defining Relation in elog.h seems completely out of place. Agreed. Maybe a reasonable solution is to allow some ereport helper functions (or, really, wrappers for the helper functions) to be declared someplace else than elog.h. They'd likely need to be implemented someplace else than elog.c, too, so this doesn't seem unreasonable. The generic helper function approach doesn't seem too unreasonable for this: elog.h/.c would provide something like err_generic_string(int fieldid, const char *str) and then someplace else could provide functions built on this that insert table/schema/column/constraint/etc names into suitable fields. regards, tom lane eelog-2012-08-20.diff Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New WAL code dumps core trivially on replay of bad data
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 18.08.2012 08:52, Amit kapila wrote: I think that missing check of total length has caused this problem. However now this check will be different. That check still exists, in ValidXLogRecordHeader(). However, we now allocate the buffer for the whole record before that check, based on xl_tot_len, if the record header is split across pages. The theory in allocating the buffer is that a bogus xl_tot_len field will cause the malloc() to fail, returning NULL, and we treat that the same as a broken header. Uh, no, you misread it. xl_tot_len is *zero* in this example. The problem is that RecordIsValid believes xl_len (and backup block size) even when it exceeds xl_tot_len. I think we need to delay the allocation of the record buffer. We need to read and validate the whole record header first, like we did before, before we trust xl_tot_len enough to call malloc() with it. I'll take a shot at doing that. I don't believe this theory at all. Overcommit applies to writing on pages that were formerly shared with the parent process --- it should not have anything to do with malloc'ing new space. But anyway, this is not what happened in my example. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New WAL code dumps core trivially on replay of bad data
On Monday, August 20, 2012 04:04:52 PM Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 18.08.2012 08:52, Amit kapila wrote: I think that missing check of total length has caused this problem. However now this check will be different. That check still exists, in ValidXLogRecordHeader(). However, we now allocate the buffer for the whole record before that check, based on xl_tot_len, if the record header is split across pages. The theory in allocating the buffer is that a bogus xl_tot_len field will cause the malloc() to fail, returning NULL, and we treat that the same as a broken header. Uh, no, you misread it. xl_tot_len is *zero* in this example. The problem is that RecordIsValid believes xl_len (and backup block size) even when it exceeds xl_tot_len. I think we need to delay the allocation of the record buffer. We need to read and validate the whole record header first, like we did before, before we trust xl_tot_len enough to call malloc() with it. I'll take a shot at doing that. I don't believe this theory at all. Overcommit applies to writing on pages that were formerly shared with the parent process --- it should not have anything to do with malloc'ing new space. But anyway, this is not what happened in my example. If the memory is big enough (128kb) it will be mmap'ed into place. In that case overcommiting applies before the pages have been brought in. Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] alter enum add value if not exists
Here is a patch for this feature, which should alleviate some of the woes caused by adding labels not being transactional (and thus not allowing for the catching of errors). (Also available on the add_enum_ine branch at https://bitbucket.org/adunstan/pgdevel) cheers andrew *** a/doc/src/sgml/ref/alter_type.sgml --- b/doc/src/sgml/ref/alter_type.sgml *** *** 28,34 ALTER TYPE replaceable class=PARAMETERname/replaceable OWNER TO replaceab ALTER TYPE replaceable class=PARAMETERname/replaceable RENAME ATTRIBUTE replaceable class=PARAMETERattribute_name/replaceable TO replaceable class=PARAMETERnew_attribute_name/replaceable ALTER TYPE replaceable class=PARAMETERname/replaceable RENAME TO replaceable class=PARAMETERnew_name/replaceable [ CASCADE | RESTRICT ] ALTER TYPE replaceable class=PARAMETERname/replaceable SET SCHEMA replaceable class=PARAMETERnew_schema/replaceable ! ALTER TYPE replaceable class=PARAMETERname/replaceable ADD VALUE replaceable class=PARAMETERnew_enum_value/replaceable [ { BEFORE | AFTER } replaceable class=PARAMETERexisting_enum_value/replaceable ] phrasewhere replaceable class=PARAMETERaction/replaceable is one of:/phrase --- 28,34 ALTER TYPE replaceable class=PARAMETERname/replaceable RENAME ATTRIBUTE replaceable class=PARAMETERattribute_name/replaceable TO replaceable class=PARAMETERnew_attribute_name/replaceable ALTER TYPE replaceable class=PARAMETERname/replaceable RENAME TO replaceable class=PARAMETERnew_name/replaceable [ CASCADE | RESTRICT ] ALTER TYPE replaceable class=PARAMETERname/replaceable SET SCHEMA replaceable class=PARAMETERnew_schema/replaceable ! ALTER TYPE replaceable class=PARAMETERname/replaceable ADD VALUE [ IF NOT EXISTS ] replaceable class=PARAMETERnew_enum_value/replaceable [ { BEFORE | AFTER } replaceable class=PARAMETERexisting_enum_value/replaceable ] phrasewhere replaceable class=PARAMETERaction/replaceable is one of:/phrase *** *** 106,112 ALTER TYPE replaceable class=PARAMETERname/replaceable ADD VALUE replacea /varlistentry varlistentry ! termliteralADD VALUE [ BEFORE | AFTER ]/literal/term listitem para This form adds a new value to an enum type. If the new value's place in --- 106,112 /varlistentry varlistentry ! termliteralADD VALUE [ IF NOT EXISTS ] [ BEFORE | AFTER ]/literal/term listitem para This form adds a new value to an enum type. If the new value's place in *** *** 114,119 ALTER TYPE replaceable class=PARAMETERname/replaceable ADD VALUE replacea --- 114,124 literalAFTER/literal, then the new item is placed at the end of the list of values. /para + para + If literalIF NOT EXISTS/literal is used, it is not an error if the + type already contains the new value, and no action is taken. Otherwise, + an error will occur if the new value is already present. + /para /listitem /varlistentry *** a/src/backend/catalog/pg_enum.c --- b/src/backend/catalog/pg_enum.c *** *** 177,183 void AddEnumLabel(Oid enumTypeOid, const char *newVal, const char *neighbor, ! bool newValIsAfter) { Relation pg_enum; Oid newOid; --- 177,184 AddEnumLabel(Oid enumTypeOid, const char *newVal, const char *neighbor, ! bool newValIsAfter, ! bool skipIfExists) { Relation pg_enum; Oid newOid; *** *** 199,204 AddEnumLabel(Oid enumTypeOid, --- 200,220 errdetail(Labels must be %d characters or less., NAMEDATALEN - 1))); + /* Do the IF NOT EXISTS test if specified */ + if (skipIfExists) + { + HeapTuple tup; + + tup = SearchSysCache2(ENUMTYPOIDNAME, + ObjectIdGetDatum(enumTypeOid), + CStringGetDatum(newVal)); + if (HeapTupleIsValid(tup)) + { + ReleaseSysCache(tup); + return; + } + } + /* * Acquire a lock on the enum type, which we won't release until commit. * This ensures that two backends aren't concurrently modifying the same *** a/src/backend/commands/typecmds.c --- b/src/backend/commands/typecmds.c *** *** 1187,1193 AlterEnum(AlterEnumStmt *stmt) /* Add the new label */ AddEnumLabel(enum_type_oid, stmt-newVal, ! stmt-newValNeighbor, stmt-newValIsAfter); ReleaseSysCache(tup); } --- 1187,1194 /* Add the new label */ AddEnumLabel(enum_type_oid, stmt-newVal, ! stmt-newValNeighbor, stmt-newValIsAfter, ! stmt-skipIfExists); ReleaseSysCache(tup); } *** a/src/backend/nodes/copyfuncs.c --- b/src/backend/nodes/copyfuncs.c *** *** 3041,3046 _copyAlterEnumStmt(const AlterEnumStmt *from) --- 3041,3047 COPY_STRING_FIELD(newVal); COPY_STRING_FIELD(newValNeighbor); COPY_SCALAR_FIELD(newValIsAfter); + COPY_SCALAR_FIELD(skipIfExists); return newnode; } ***
Re: [HACKERS] New WAL code dumps core trivially on replay of bad data
On 20.08.2012 17:04, Tom Lane wrote: Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes: On 18.08.2012 08:52, Amit kapila wrote: I think that missing check of total length has caused this problem. However now this check will be different. That check still exists, in ValidXLogRecordHeader(). However, we now allocate the buffer for the whole record before that check, based on xl_tot_len, if the record header is split across pages. The theory in allocating the buffer is that a bogus xl_tot_len field will cause the malloc() to fail, returning NULL, and we treat that the same as a broken header. Uh, no, you misread it. xl_tot_len is *zero* in this example. The problem is that RecordIsValid believes xl_len (and backup block size) even when it exceeds xl_tot_len. Ah yes, I see that now. I think all we need then is a check for xl_tot_len = SizeOfXLogRecord. I think we need to delay the allocation of the record buffer. We need to read and validate the whole record header first, like we did before, before we trust xl_tot_len enough to call malloc() with it. I'll take a shot at doing that. I don't believe this theory at all. Overcommit applies to writing on pages that were formerly shared with the parent process --- it should not have anything to do with malloc'ing new space. But anyway, this is not what happened in my example. I was thinking that we might read gigabytes worth of bogus WAL into the memory buffer, if xl_tot_len is bogus and large, e.g 0x. But now that I look closer, the xlog record is validated after reading the first continuation page, so we should catch a bogus xl_tot_len value at that point. And there is a cross-check with xl_rem_len on every continuation page, too. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New WAL code dumps core trivially on replay of bad data
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 20.08.2012 17:04, Tom Lane wrote: Uh, no, you misread it. xl_tot_len is *zero* in this example. The problem is that RecordIsValid believes xl_len (and backup block size) even when it exceeds xl_tot_len. Ah yes, I see that now. I think all we need then is a check for xl_tot_len = SizeOfXLogRecord. That should get us back to a reliability level similar to the old code. However, I think that we also need to improve RecordIsValid so that at each step, it checks it hasn't overrun xl_tot_len *before* touching the corresponding part of the record buffer. I was thinking that we might read gigabytes worth of bogus WAL into the memory buffer, if xl_tot_len is bogus and large, e.g 0x. But now that I look closer, the xlog record is validated after reading the first continuation page, so we should catch a bogus xl_tot_len value at that point. And there is a cross-check with xl_rem_len on every continuation page, too. Yeah. Even if xl_tot_len is bogus, we should realize that within a couple of pages at most. The core of the problem here is that RecordIsValid is not being careful to confine its touches to the guaranteed-to-exist bytes of the record buffer, ie 0 .. xl_tot_len-1. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SP-GiST for ranges based on 2d-mapping and quad-tree
Jeff Davis pg...@j-davis.com writes: On Sat, 2012-07-28 at 17:50 -0400, Tom Lane wrote: which would come back to bite us if we ever try to support index-only scans with SPGiST. I'm confused: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=92203624934095163f8b57b5b3d7bbd2645da2c8 Sorry, I was being imprecise there. What I meant was that an opclass that abused the reconstructed-value storage for something else might have problems supporting index-only scans. If we think opclasses might need private storage for index searches, we should add that as a new part of the API, not tell them to misuse this part. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Avoiding repeated snapshot computation
On Thu, Aug 16, 2012 at 9:02 PM, Bruce Momjian br...@momjian.us wrote: Did we ever make a decision on this patch? I committed it as 1fc3d18faa8f4476944bc6854be0f7f6adf4aec8. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Primary Key Constraint on inheritance table not getting route to child tables
Excerpts from Rushabh Lathia's message of lun ago 20 02:50:52 -0400 2012: Hi, ALTER TABLE ADD Constraints PRIMARY KEY on inheritance table not getting route to child table. But when we do ALTER TABLE DROP Constraint on the same, it complains about constraint does not exists on child table. This is a known 9.2 bug, fixed a month in this commit: Author: Alvaro Herrera alvhe...@alvh.no-ip.org Branch: master [f5bcd398a] 2012-07-20 14:08:07 -0400 Branch: REL9_2_STABLE [d721f208a] 2012-07-20 14:07:09 -0400 connoinherit may be true only for CHECK constraints The code was setting it true for other constraints, which is bogus. Doing so caused bogus catalog entries for such constraints, and in particular caused an error to be raised when trying to drop a constraint of types other than CHECK from a table that has children, such as reported in bug #6712. In 9.2, additionally ignore connoinherit=true for other constraint types, to avoid having to force initdb; existing databases might already contain bogus catalog entries. Includes a catversion bump (in HEAD only). Bug report from Miroslav Šulc Analysis from Amit Kapila and Noah Misch; Amit also contributed the patch. I cannot reproduce it in 9.2 HEAD or master HEAD. I assume you were testing with something older than the above commit; the 9.1 branch does not contain the bug. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] The pgrminclude problem
On Thu, Aug 16, 2012 at 12:17 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: On 16 August 2012 16:56, Bruce Momjian br...@momjian.us wrote: Good to know. We only use pgrminclude very five years or so, and Tom isn't even keen on that. Yeah. Even if this could be made to work well, we'd still have to do something like get an absolute consensus from all build farm animals, if we expected to have an absolutely trustworthy list. I don't think pgrminclude is a bad idea. I just think that it should only be used to guide the efforts of a human to remove superfluous #includes, which is how it is used anyway. I actually think we'd probably be better off running pgrminclude once per release cycle rather than any less often. When the number of changes gets into the hundreds or thousands of lines it becomes much more difficult to validate that it's doing anything sensible. I ran it a while back and found a bunch of stuff that looked like it was obviously worth fixing, but I was afraid of getting yelled at if I went and fixed it, so I didn't. Somehow that doesn't seem like an ideal situation... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Primary Key Constraint on inheritance table not getting route to child tables
Rushabh Lathia rushabh.lat...@gmail.com writes: ALTER TABLE ADD Constraints PRIMARY KEY on inheritance table not getting route to child table. Right. But when we do ALTER TABLE DROP Constraint on the same, it complains about constraint does not exists on child table. Works for me in HEAD. What version are you testing? This seems related to some recent bug fixes ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unexpected plperl difference between 8.4 and 9.1
On Sun, Aug 19, 2012 at 2:26 PM, Joel Jacobson j...@trustly.com wrote: After upgrading from 8.4 to 9.1, one of my plperl functions stopped working properly. For some reason, when matching a string using a regex, the $1 variable cannot be returned directly using return_next() but must be set to a variable first. If returned directly, it appears to be cached in some strange way, returning the same value for all 10 rows in the example below. Hrm seems to work for me. What version of perl is this? $ perl -V Summary of my perl5 (revision 5 version 16 subversion 0) configuration: [snip] Characteristics of this binary (from libperl): Compile-time options: HAS_TIMES MYMALLOC PERLIO_LAYERS PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP PERL_PRESERVE_IVUV USE_64_BIT_ALL USE_64_BIT_INT USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_PERLIO USE_PERL_ATOF $!psql baroque= SELECT version(); version - PostgreSQL 9.1.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.7.1 20120721 (prerelease), 64-bit (1 row) baroque= CREATE OR REPLACE FUNCTION test1() RETURNS SETOF NUMERIC AS $BODY$ baroque$ use strict; baroque$ use warnings; baroque$ for(my $i=0 ; $i10; $i++) { baroque$ my $rand = rand(); baroque$ $rand =~ m/(.*)/; baroque$ return_next($1); baroque$ } baroque$ return; baroque$ $BODY$ LANGUAGE plperl; CREATE FUNCTION baroque= baroque= select * from test1(); test1 --- 0.284491935120062 0.213769321886019 0.758221121077565 0.810816779589864 0.649781285447791 0.630792307420037 0.17897035660857 0.876314955338863 0.899575315174307 0.225134707347706 (10 rows)
Re: [HACKERS] The pgrminclude problem
Excerpts from Robert Haas's message of lun ago 20 11:43:44 -0400 2012: I actually think we'd probably be better off running pgrminclude once per release cycle rather than any less often. When the number of changes gets into the hundreds or thousands of lines it becomes much more difficult to validate that it's doing anything sensible. I ran it a while back and found a bunch of stuff that looked like it was obviously worth fixing, but I was afraid of getting yelled at if I went and fixed it, so I didn't. Somehow that doesn't seem like an ideal situation... Alternatively you could post a patch for comment. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug of pg_trgm?
On Sat, Aug 11, 2012 at 8:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: Fujii Masao masao.fu...@gmail.com writes: No. ISTM that in_wildcard_meta must be reset before the second loop. Because the meaning of that flag in the first loop is different from that in the second loop. The former and the latter indicate whether the search string has *heading* and *tailing* wildcard character, respectively. No? Oh, good point. Maybe it would be clearer to use two separate flag variables? Agreed. Attached patch uses two separate flag variables. On Sat, Aug 11, 2012 at 8:19 AM, Tom Lane t...@sss.pgh.pa.us wrote: Fujii Masao masao.fu...@gmail.com writes: On Thu, Aug 9, 2012 at 3:05 AM, Tom Lane t...@sss.pgh.pa.us wrote: Probably a minimal fix for this could be made by backing up endword one byte before returning it if in_escape is true when the second loop exits. That would not scale up to preserving the state of in_wildcard_meta, but since the second loop never advances past a meta char, that's okay for the moment. Or what about extending get_wildcard_part() so that it accepts the pointer to in_escape as an argument? generate_wildcard_trgm() can know the last value of in_escape and specify it the next call of get_wildcard_part(). Looks very simple. Yeah, I had considered pushing the state variables out to the caller. If there were any prospect of wanting more state than just in_escape, I'd be for that --- but I don't see any reason to possibly need more, especially in view of your point that in_wildcard_meta isn't really a single flag with an interpretation that remains fixed throughout. I think it's probably better just to take care of the issue inside get_wildcard_part, and not complicate its API. OK. Attached patch fixes the problem as you suggested, i.e., it backs up endword if the second loop exits in an escape pair. Regards, -- Fujii Masao trgm_bugfix_v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unexpected plperl difference between 8.4 and 9.1
Excerpts from Alex Hunsaker's message of lun ago 20 12:03:11 -0400 2012: On Sun, Aug 19, 2012 at 2:26 PM, Joel Jacobson j...@trustly.com wrote: After upgrading from 8.4 to 9.1, one of my plperl functions stopped working properly. For some reason, when matching a string using a regex, the $1 variable cannot be returned directly using return_next() but must be set to a variable first. If returned directly, it appears to be cached in some strange way, returning the same value for all 10 rows in the example below. Hrm seems to work for me. What version of perl is this? $ perl -V Summary of my perl5 (revision 5 version 16 subversion 0) configuration: [snip] Characteristics of this binary (from libperl): Compile-time options: HAS_TIMES MYMALLOC PERLIO_LAYERS PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP PERL_PRESERVE_IVUV USE_64_BIT_ALL USE_64_BIT_INT USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_PERLIO USE_PERL_ATOF I can reproduce the failure with 5.14.2 alvherre=# CREATE OR REPLACE FUNCTION test1() RETURNS SETOF NUMERIC AS $BODY$ alvherre$# use strict; alvherre$# use warnings; alvherre$# for(my $i=0 ; $i10; $i++) { alvherre$# my $rand = rand(); alvherre$# $rand =~ m/(.*)/; alvherre$# return_next($1); alvherre$# } alvherre$# return; alvherre$# $BODY$ LANGUAGE plperl; CREATE FUNCTION alvherre=# select * from test1(); test1 --- 0.396088311522366 0.396088311522366 0.396088311522366 0.396088311522366 0.396088311522366 0.396088311522366 0.396088311522366 0.396088311522366 0.396088311522366 0.396088311522366 (10 filas) It works fine if I assign $1 to another variable before return_next'ing it: alvherre=# CREATE OR REPLACE FUNCTION test1() RETURNS SETOF NUMERIC AS $BODY$ use strict; use warnings; for(my $i=0 ; $i10; $i++) { my $rand = rand(); $rand =~ m/(.*)/; my $a=$1; return_next($a); } return; $BODY$ LANGUAGE plperl; CREATE FUNCTION alvherre=# select * from test1(); test1 --- 0.693569484473304 0.757589839023666 0.477233897467283 0.572963701418253 0.189924114046409 0.20155773007717 0.624452321926892 0.134135086596039 0.417606638502921 0.95250325772281 (10 filas) (In short, same as Joel). -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New WAL code dumps core trivially on replay of bad data
On 20.08.2012 18:25, Tom Lane wrote: Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes: I was thinking that we might read gigabytes worth of bogus WAL into the memory buffer, if xl_tot_len is bogus and large, e.g 0x. But now that I look closer, the xlog record is validated after reading the first continuation page, so we should catch a bogus xl_tot_len value at that point. And there is a cross-check with xl_rem_len on every continuation page, too. Yeah. Even if xl_tot_len is bogus, we should realize that within a couple of pages at most. The core of the problem here is that RecordIsValid is not being careful to confine its touches to the guaranteed-to-exist bytes of the record buffer, ie 0 .. xl_tot_len-1. Hmm, RecordIsValid() assumes that the whole record has been read into memory already, where whole record is defined by xl_tot_len. The problem is that xl_len disagrees with xl_tot_len. Validating the XLOG header would've caught that, but in this case the caller had not called ValidXLogRecordHeader(). However, a suitably corrupt record might have a valid header, but *appear* to have larger backup blocks than the header claims. You would indeed overrun the memory buffer while calculating the CRC, then. So yeah, we should check that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] The pgrminclude problem
Robert Haas robertmh...@gmail.com writes: On Thu, Aug 16, 2012 at 12:17 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: Yeah. Even if this could be made to work well, we'd still have to do something like get an absolute consensus from all build farm animals, if we expected to have an absolutely trustworthy list. I don't think pgrminclude is a bad idea. I just think that it should only be used to guide the efforts of a human to remove superfluous #includes, which is how it is used anyway. I actually think we'd probably be better off running pgrminclude once per release cycle rather than any less often. If it were more automatic and less prone to give bogus answers, I could get behind that ... but as is, I'd frankly be happier if we *never* ran it. It took quite a lot of effort to dig out from under the mess it made last time, and I don't recall that we have ever had a run that was entirely trouble-free. Now, a contributing factor to the most recent mess was that somebody had created circular header #include's; maybe it would help if the thing were programmed to notice that and punt, rather than doing its best to wind the ball of string even tighter. In general, though, any recommendation from the tool to remove #includes in headers, as opposed to consumer .c files, needs to be taken with about ten grains of salt. The other serious problem, as Peter notes, is that there are inclusions that are only needed on particular platforms or with particular build options. AFAIK, Bruce's current methodology for running pgrminclude takes no account of that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unexpected plperl difference between 8.4 and 9.1
Alvaro Herrera alvhe...@2ndquadrant.com writes: Excerpts from Alex Hunsaker's message of lun ago 20 12:03:11 -0400 2012: Hrm seems to work for me. What version of perl is this? $ perl -V Summary of my perl5 (revision 5 version 16 subversion 0) configuration: I can reproduce the failure with 5.14.2 Smells like a Perl bug to me. Has anybody attempted to reproduce this just in Perl itself, not PL/Perl? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Primary Key Constraint on inheritance table not getting route to child tables
On Mon, Aug 20, 2012 at 9:28 PM, Tom Lane t...@sss.pgh.pa.us wrote: Rushabh Lathia rushabh.lat...@gmail.com writes: ALTER TABLE ADD Constraints PRIMARY KEY on inheritance table not getting route to child table. Right. But when we do ALTER TABLE DROP Constraint on the same, it complains about constraint does not exists on child table. Works for me in HEAD. What version are you testing? This seems related to some recent bug fixes ... Oh ok. Sorry for wrong noise, I was checking this on old version. Thanks, regards, tom lane -- -- Rushabh Lathia Technical Architect EnterpriseDB Corporation The Enterprise Postgres Company Phone: +91-20-30589494 Website: http://www.enterprisedb.com EnterpriseDB Blog: http://blogs.enterprisedb.com/ Follow us on Twitter: http://www.twitter.com/enterprisedb
Re: [HACKERS] gistchoose vs. bloat
On Mon, Aug 20, 2012 at 7:13 AM, Jeff Davis pg...@j-davis.com wrote: I took a look at this patch. The surrounding code is pretty messy (not necessarily because of your patch). A few comments would go a long way. The 'which_grow' array is initialized as it goes, first using pointer notations (*which_grows = -1.0) and then using subscript notation. As far as I can tell, the first r-rd_att-natts of the array (the only elements that matter) need to be written the first time through anyway. Why not just replace which_grow[j] 0 with i == FirstOffsetNumber and add a comment that we're initializing the penalties with the first index tuple? The 'sum_grow' didn't make any sense, thank you for getting rid of that. Also, we should document that the earlier attributes always take precedence, which is why we break out of the inner loop as soon as we encounter an attribute with a higher penalty. Please add a comment indicating why you are randomly choosing among the equal penalties. I think that there might be a problem with the logic, as well. Let's say you have two attributes and there are two index tuples, it1 and it2; with penalties [10,10] and [10,100] respectively. The second time through the outer loop, with i = 2, you might (P=0.5) assign 2 to the 'which' variable in the first iteration of the inner loop, before it realizes that it2 actually has a higher penalty. I think you need to finish out the inner loop and have a flag that indicates that all attributes are equal before you do the probabilistic replacement. Current gistchoose code has a bug. I've started separate thread about it. http://archives.postgresql.org/pgsql-hackers/2012-08/msg00544.php Also, it obviously needs more comments. Current state of patch is more proof of concept than something ready. I'm going to change it in following ways: 1) We don't know how expensive user penalty function is. So, I'm going to change randomization algorithm so that it doesn't increase number of penalty calls in average. 2) Since, randomization could produce additional IO, there are probably no optimal solution for all the cases. We could introduce user-visible option which enables or disables randomization. However, default value of this option is another question. Also, I think you should use random() rather than rand(). Thanks, will fix. -- With best regards, Alexander Korotkov.
Re: [HACKERS] bug of pg_trgm?
Fujii Masao masao.fu...@gmail.com writes: OK. Attached patch fixes the problem as you suggested, i.e., it backs up endword if the second loop exits in an escape pair. Applied with a bit of further adjustment of the comments. Thanks! regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rules and WITH and LATERAL
On Sun, Aug 19, 2012 at 12:06:30PM -0400, Tom Lane wrote: While thinking about this I wondered whether it might be possible to clean up the implementation of rules, and perhaps also get rid of some of their semantic issues, by making the rule rewriter rely on WITH and/or LATERAL, neither of which we had back in the dark ages when the current rules implementation was built. In particular, WITH might offer a fix for the multiple-evaluation gotchas that people so often trip over. For instance, perhaps an UPDATE with rules could be rewritten into something like Making the rule system use WITH always seemed like a good idea to me. ISTM though that it would tax the optimiser, as it would need to become much more clever at pushing conditions down. For example, on 9.1 at least you still get this: $ explain with x as (select * from pg_class) select * from x where relname = 'test'; QUERY PLAN - CTE Scan on x (cost=14.15..23.49 rows=2 width=189) Filter: (relname = 'test'::name) CTE x - Seq Scan on pg_class (cost=0.00..14.15 rows=415 width=194) (4 rows) whereas without the with you get an index scan. So in its current form you can't use WITH to simplify the implementation of views because performence would suck. OTOH, the intelligence in the current rule system may be a good guide to optimise WITH statements. Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ He who writes carelessly confesses thereby at the very outset that he does not attach much importance to his own thoughts. -- Arthur Schopenhauer signature.asc Description: Digital signature
Re: [HACKERS] SERIAL columns in foreign tables
On Fri, Aug 17, 2012 at 10:53 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: However I'm wondering if it'd be better to tweak the code to explicitely check for SERIAL/BIGSERIAL instead of letting it error out in internal conditions. The way it currently is, it seems a bit user-unfriendly to me. I don't think the current error message is horrible, but I don't object to improving it, either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TRUE/FALSE vs true/false
On Thu, Aug 16, 2012 at 3:32 PM, Bruce Momjian br...@momjian.us wrote: On Thu, Aug 16, 2012 at 02:21:12PM -0500, Kevin Grittner wrote: Bruce Momjian br...@momjian.us wrote: So what do we want to do with this? I am a little concerned that we are sacrificing code clarity for backpatching ease, but I don't do as much backpatching as Tom. Well, if you back-patched this change, it would eliminate the issue for Tom, wouldn't it? Not sure if that's sane; just a thought. I would be worried about some instability in backpatching. I was looking for an 'ignore-case' mode to patch, but I don't see it. I have difficult believing that a change of this type, if implemented judiciously, is really going to create that much difficulty in back-patching. I don't do as much back-patching as Tom either (no one does), but most of the patches I do back-patch can be cherry-picked all the way back without a problem. Some require adjustment, but even then this kind of thing is pretty trivial to handle, as it's pretty obvious what happened when you look through it. The really nasty problems tend to come from places where the code has been rearranged, rather than simple A-for-B substitutions. I think the thing we need to look at is what percentage of our code churn is coming from stuff like this, versus what percentage of it is coming from other factors. If we change 250,000 lines of code per release cycle and of that this kind of thing accounts for 5,000 lines of deltas, then IMHO it's not really material. If it accounts for 50,000 lines of deltas out of the same base, that's probably more than can really be justified by the benefit we're going to get out of it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sha1, sha2 functions into core?
On 8/15/12 6:48 AM, Tom Lane wrote: The argument against moving crypto code into core remains the same as it was, ie export regulations. I don't see that that situation has changed at all. Actually, I believe that it has, based on my experience getting an export certificate for Sun Postgres back in 2008. The US Federal government lifted restrictions on shipping well-known cryptographic algorithms to most countries several years ago, except to specific countries with embargoes (Iran, Burma, etc.). However, *all* exports of software to those embargoed countries are restricted, cryptographic or not. The USA does require an export certificate for any cryptographic-supporting software which is shipped from the USA. For that, however, MD5 and our support for SSL authentication already requires a certificate, whether we include SHA or not. So, my personal non-lawyer experience is that including SHA in core or not would make no difference whatsoever to our export status. The above is all secondhand legal knowledge, so if it really matters to our decisions on what algorithms we include in Core, we should ask SFLC for a real opinion. We certainly shouldn't make one based on assumptions. I think it's more significant, though, that nobody has been able to demonstrate that SHA hashing of passwords actually makes Postgres more secure. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Timing overhead and Linux clock sources
On Thu, Aug 16, 2012 at 10:28 PM, Bruce Momjian br...@momjian.us wrote: FYI, I am planning to go ahead and package this tool in /contrib for PG 9.3. Isn't this exactly what we already did, in 9.2, in the form of contrib/pg_test_timing? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table
On Thu, Aug 16, 2012 at 9:05 PM, Bruce Momjian br...@momjian.us wrote: A TODO for this? You mean this part? On the other hand, the problem of the FSM taking up 24kB for an 8kB table seems clearly worth fixing, but I don't think I have the cycles for it at present. Maybe a TODO is in order. I certainly think that'd be worth a TODO. Whether the rest of this is worth worrying about I'm not sure. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] The pgrminclude problem
On Mon, Aug 20, 2012 at 12:03 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Excerpts from Robert Haas's message of lun ago 20 11:43:44 -0400 2012: I actually think we'd probably be better off running pgrminclude once per release cycle rather than any less often. When the number of changes gets into the hundreds or thousands of lines it becomes much more difficult to validate that it's doing anything sensible. I ran it a while back and found a bunch of stuff that looked like it was obviously worth fixing, but I was afraid of getting yelled at if I went and fixed it, so I didn't. Somehow that doesn't seem like an ideal situation... Alternatively you could post a patch for comment. Yeah, maybe I'll try that if I get back around to working on this at some point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sha1, sha2 functions into core?
On 08/20/2012 03:10 PM, Josh Berkus wrote: On 8/15/12 6:48 AM, Tom Lane wrote: The argument against moving crypto code into core remains the same as it was, ie export regulations. I don't see that that situation has changed at all. Actually, I believe that it has, based on my experience getting an export certificate for Sun Postgres back in 2008. The US Federal government lifted restrictions on shipping well-known cryptographic algorithms to most countries several years ago, except to specific countries with embargoes (Iran, Burma, etc.). However, *all* exports of software to those embargoed countries are restricted, cryptographic or not. The USA does require an export certificate for any cryptographic-supporting software which is shipped from the USA. For that, however, MD5 and our support for SSL authentication already requires a certificate, whether we include SHA or not. So, my personal non-lawyer experience is that including SHA in core or not would make no difference whatsoever to our export status. The above is all secondhand legal knowledge, so if it really matters to our decisions on what algorithms we include in Core, we should ask SFLC for a real opinion. We certainly shouldn't make one based on assumptions. I think it's more significant, though, that nobody has been able to demonstrate that SHA hashing of passwords actually makes Postgres more secure. I don't think US export regulations are the only issue. Some other countries (mostly the usual suspects) forbid the use of crypto software. If we build more crypto functions into the core we make it harder to use Postgres legally in those places. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] NOT NULL constraints in foreign tables
On Fri, Aug 17, 2012 at 4:08 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Excerpts from Robert Haas's message of vie ago 17 15:44:29 -0400 2012: On Fri, Aug 17, 2012 at 2:58 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I mean, what are NOT NULL in foreign tables for? Are they harmed or helped by having pg_constraint rows? As I've mentioned when this has come up before, I think that constraints on foreign tables should be viewed as declarative statements about the contents of the foreign data that the DB will assume true. This could be useful for a variety of purposes: constraint exclusion, query optimization, etc. So pg_constraint rows for NOT NULLs are a good thing, right? Well, they aren't a bad thing, anyway. The query optimizer looks at attisnull directly in the one case where this really matters (which has to do with reordering left joins IIRC). Allowing all check constraints would certainly be a step forward, as it would allow constraint exclusion. In general, it seems to me that you're saying we should just lift the DefineRelation-enforced restriction that foreign tables ought not to have constraints. So if the user wants to specify CREATE FOREIGN TABLE people ( who person CHECK (who IS OF TYPE 'human'), .. ) server foobar; we ought to let them. Correct? Yeah, that's my view. Note that I excluded this from the initial syntax commit of foreign tables out of some feeling that there were some loose ends that weren't adequately handled by the original patch, which did allow them. I no longer remember what the deficiencies were, unfortunately. Obviously, at a minimum, we need to make sure that they are dumped-and-restored properly, displayed by psql properly, and documented. But in general +1 for allowing this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql boolean display
On Sun, Aug 19, 2012 at 12:02 PM, Phil Sorber p...@omniti.com wrote: I am providing a patch to allow you to change the output of a boolean value in psql much like you can do with NULL. A client requested this feature and we thought it may appeal to someone else in the community. The patch includes updated docs and a regression test. The code changes themselves are pretty simple and straightforward. Example from the regression test: SELECT true, false; bool | bool --+-- t| f (1 row) \pset booltrue 'foo' \pset boolfalse 'bar' SELECT true, false; bool | bool --+-- foo | bar (1 row) \pset booltrue 't' \pset boolfalse 'f' SELECT true, false; bool | bool --+-- t| f (1 row) As always, comments welcome. Why not just do it in the SQL? SELECT CASE WHEN whatever THEN 'foo' ELSE 'bar' END AS whatever; -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] NOT NULL constraints in foreign tables
On Fri, 2012-08-17 at 15:44 -0400, Robert Haas wrote: On Fri, Aug 17, 2012 at 2:58 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I mean, what are NOT NULL in foreign tables for? Are they harmed or helped by having pg_constraint rows? As I've mentioned when this has come up before, I think that constraints on foreign tables should be viewed as declarative statements about the contents of the foreign data that the DB will assume true. This could be useful for a variety of purposes: constraint exclusion, query optimization, etc. There are at least three kinds of constraint enforcement: 1. Enforced before the query runs (e.g. the current behavior on a normal table). 2. Enforced when the query runs by validating the constraint as you go, and then throwing an error when it turns out to be false. 3. Don't make any attempt to enforce, and silently produce wrong results if it's false. Which are you proposing, and how will the user know which kind of constraint they've got? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sha1, sha2 functions into core?
I don't think US export regulations are the only issue. Some other countries (mostly the usual suspects) forbid the use of crypto software. If we build more crypto functions into the core we make it harder to use Postgres legally in those places. Again, that sounds like we need an actual legal opinion if we're going to make a decision on that basis. So let's make the decision on whether we even *want* SHA in core, and if we do, we can ask our attorneys/community if it's a legal problem. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sha1, sha2 functions into core?
On 08/20/2012 01:21 PM, Josh Berkus wrote: I don't think US export regulations are the only issue. Some other countries (mostly the usual suspects) forbid the use of crypto software. If we build more crypto functions into the core we make it harder to use Postgres legally in those places. I fail to see how that is our problem. We shouldn't make the software less useful because of those places. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC @cmdpromptinc - 509-416-6579 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table
Robert Haas robertmh...@gmail.com writes: On the other hand, the problem of the FSM taking up 24kB for an 8kB table seems clearly worth fixing, but I don't think I have the cycles for it at present. Maybe a TODO is in order. I certainly think that'd be worth a TODO. Whether the rest of this is worth worrying about I'm not sure. Surely we could just prevent creation of the FSM until the table has reached at least, say, 10 blocks. Any threshold beyond one block would mean potential space wastage, but it's hard to get excited about that until you're into the dozens of pages. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tab completion for DROP CONSTRAINT
On Sun, Aug 19, 2012 at 4:43 PM, Jeff Janes jeff.ja...@gmail.com wrote: Interactively dropping primary key constraints has been annoying me. I believe this patch fixes that, hopefully for other kinds of cataloged constraints as well. Committed, thanks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sha1, sha2 functions into core?
On 08/20/2012 04:26 PM, Joshua D. Drake wrote: On 08/20/2012 01:21 PM, Josh Berkus wrote: I don't think US export regulations are the only issue. Some other countries (mostly the usual suspects) forbid the use of crypto software. If we build more crypto functions into the core we make it harder to use Postgres legally in those places. I fail to see how that is our problem. We shouldn't make the software less useful because of those places. But there is absolutely no evidence that we are making it less useful. Postgres is designed top be extensible and we've just enhanced that. pgcrypto makes use of that. If we cen leverage that to make Postgres available to more people then why would we not do so? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CLUSTER VERBOSE tab completion
On Sun, Aug 19, 2012 at 4:55 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Fri, Aug 17, 2012 at 7:18 PM, Jeff Janes jeff.ja...@gmail.com wrote: tab completion will add USING after CLUSTER VERBOSE, as if VERBOSE were the name of a table. Instead of just making it not do the wrong thing, I tried to make it actually do the right thing. It doesn't fill in the VERBOSE for you, you have to type that in full, This short coming has now been rectified. Committed this one too. Thanks for the patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table
On Mon, Aug 20, 2012 at 4:27 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On the other hand, the problem of the FSM taking up 24kB for an 8kB table seems clearly worth fixing, but I don't think I have the cycles for it at present. Maybe a TODO is in order. I certainly think that'd be worth a TODO. Whether the rest of this is worth worrying about I'm not sure. Surely we could just prevent creation of the FSM until the table has reached at least, say, 10 blocks. Any threshold beyond one block would mean potential space wastage, but it's hard to get excited about that until you're into the dozens of pages. I dunno, I think one-row tables are pretty common. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] NOT NULL constraints in foreign tables
On Mon, Aug 20, 2012 at 3:56 PM, Jeff Davis pg...@j-davis.com wrote: On Fri, 2012-08-17 at 15:44 -0400, Robert Haas wrote: On Fri, Aug 17, 2012 at 2:58 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I mean, what are NOT NULL in foreign tables for? Are they harmed or helped by having pg_constraint rows? As I've mentioned when this has come up before, I think that constraints on foreign tables should be viewed as declarative statements about the contents of the foreign data that the DB will assume true. This could be useful for a variety of purposes: constraint exclusion, query optimization, etc. There are at least three kinds of constraint enforcement: 1. Enforced before the query runs (e.g. the current behavior on a normal table). 2. Enforced when the query runs by validating the constraint as you go, and then throwing an error when it turns out to be false. 3. Don't make any attempt to enforce, and silently produce wrong results if it's false. Which are you proposing, and how will the user know which kind of constraint they've got? I'm proposing #1 for regular tables, as has always been the case, and #3 for foreign tables. #1 is not a reasonable alternative for foreign tables because we lack enforcement power in that case, and #2 is also not reasonable, because the only point of allowing declarative constraints is to get better performance, and if we go with #2 then we've pretty much thrown that out the window. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sha1, sha2 functions into core?
On 08/20/2012 01:33 PM, Andrew Dunstan wrote: But there is absolutely no evidence that we are making it less useful. Postgres is designed top be extensible and we've just enhanced that. pgcrypto makes use of that. If we cen leverage that to make Postgres available to more people then why would we not do so? O.k. that is valid a valid argument. Let me counter. Everybody else does it, why don't we? PostgreSQL is extensible, modular and programmable, why are we limiting those features by not including them in core? Contrib, whether we like it or not, is not core. For some things it makes absolute sense to keep them in contrib or pgxn but cryptography is pretty much a basic core feature set at this point. MySQL, MSSQL, Oracle (not sure if integrated or as a pack) and not to mention Java and Python all have them integrated. Sincerely, Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC @cmdpromptinc - 509-416-6579 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sha1, sha2 functions into core?
On 20 August 2012 21:26, Joshua D. Drake j...@commandprompt.com wrote: On 08/20/2012 01:21 PM, Josh Berkus wrote: I don't think US export regulations are the only issue. Some other countries (mostly the usual suspects) forbid the use of crypto software. If we build more crypto functions into the core we make it harder to use Postgres legally in those places. I fail to see how that is our problem. We shouldn't make the software less useful because of those places. Agreed. I find the idea of some secret policeman urging the use of MySQL because it doesn't have a built-in SHA-1 cryptographic hash function seems extremely far-fetched. The BitTorrent protocol uses SHA-1 to validate chunks, and it has been variously estimated that 10% - 50% of all internet traffic is BitTorrent traffic. SHA-1 is also integral to the way that git makes content effectively tamper-proof: http://www.youtube.com/watch?v=4XpnKHJAok8#t=56m -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] temporal support patch
On Sun, Aug 19, 2012 at 6:28 PM, Jeff Davis pg...@j-davis.com wrote: The other issue is how to handle multiple changes of the same record within the transaction. Should they be stored or not? In a typical audit log, I don't see any reason to. The internals of a transaction should be implementation details; invisible to the outside, right? I'm not convinced. I'm not sure that the database user is the proper thing to be stored in the history table. Many applications usually connect to a database using some virtual user and have their own users/roles tables to handle with privileges. There should be some way to substitute the stored user in the history table with the application's one. It's also helpful to store transaction id that inserted/updated/deleted the record. If the system is recording it for audit purposes, then it better be sure that it's true. You can't allow the application to pick and choose what gets stored there. That position would render this feature useless for every application for which I would otherwise have used it. I think it's just nonsense to talk about what we can or can't let the user do. The user is in charge, and our job is to allow him to do what he wants to do more easily, not to dictate what he must do. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql boolean display
2012/8/20 Robert Haas robertmh...@gmail.com: On Sun, Aug 19, 2012 at 12:02 PM, Phil Sorber p...@omniti.com wrote: I am providing a patch to allow you to change the output of a boolean value in psql much like you can do with NULL. A client requested this feature and we thought it may appeal to someone else in the community. The patch includes updated docs and a regression test. The code changes themselves are pretty simple and straightforward. Example from the regression test: SELECT true, false; bool | bool --+-- t| f (1 row) \pset booltrue 'foo' \pset boolfalse 'bar' SELECT true, false; bool | bool --+-- foo | bar (1 row) \pset booltrue 't' \pset boolfalse 'f' SELECT true, false; bool | bool --+-- t| f (1 row) As always, comments welcome. Why not just do it in the SQL? SELECT CASE WHEN whatever THEN 'foo' ELSE 'bar' END AS whatever; I understand this motivation - although I was more happy with server side solution. Regards Pavel Stehule -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql boolean display
On Aug 20, 2012 5:11 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2012/8/20 Robert Haas robertmh...@gmail.com: On Sun, Aug 19, 2012 at 12:02 PM, Phil Sorber p...@omniti.com wrote: I am providing a patch to allow you to change the output of a boolean value in psql much like you can do with NULL. A client requested this feature and we thought it may appeal to someone else in the community. The patch includes updated docs and a regression test. The code changes themselves are pretty simple and straightforward. Example from the regression test: SELECT true, false; bool | bool --+-- t| f (1 row) \pset booltrue 'foo' \pset boolfalse 'bar' SELECT true, false; bool | bool --+-- foo | bar (1 row) \pset booltrue 't' \pset boolfalse 'f' SELECT true, false; bool | bool --+-- t| f (1 row) As always, comments welcome. Why not just do it in the SQL? SELECT CASE WHEN whatever THEN 'foo' ELSE 'bar' END AS whatever; I understand this motivation - although I was more happy with server side solution. Was a server side implementation submitted before? I can change it, but I did it on the client side like the null display was done. Regards Pavel Stehule -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql boolean display
On Aug 20, 2012 5:19 PM, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:11 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2012/8/20 Robert Haas robertmh...@gmail.com: On Sun, Aug 19, 2012 at 12:02 PM, Phil Sorber p...@omniti.com wrote: I am providing a patch to allow you to change the output of a boolean value in psql much like you can do with NULL. A client requested this feature and we thought it may appeal to someone else in the community. The patch includes updated docs and a regression test. The code changes themselves are pretty simple and straightforward. Example from the regression test: SELECT true, false; bool | bool --+-- t| f (1 row) \pset booltrue 'foo' \pset boolfalse 'bar' SELECT true, false; bool | bool --+-- foo | bar (1 row) \pset booltrue 't' \pset boolfalse 'f' SELECT true, false; bool | bool --+-- t| f (1 row) As always, comments welcome. Why not just do it in the SQL? SELECT CASE WHEN whatever THEN 'foo' ELSE 'bar' END AS whatever; I understand this motivation - although I was more happy with server side solution. Was a server side implementation submitted before? I can change it, but I did it on the client side like the null display was done. Or how about both? Regards Pavel Stehule -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql boolean display
On 20 August 2012 22:10, Pavel Stehule pavel.steh...@gmail.com wrote: 2012/8/20 Robert Haas robertmh...@gmail.com: On Sun, Aug 19, 2012 at 12:02 PM, Phil Sorber p...@omniti.com wrote: I am providing a patch to allow you to change the output of a boolean value in psql much like you can do with NULL. A client requested this feature and we thought it may appeal to someone else in the community. The patch includes updated docs and a regression test. The code changes themselves are pretty simple and straightforward. Example from the regression test: SELECT true, false; bool | bool --+-- t| f (1 row) \pset booltrue 'foo' \pset boolfalse 'bar' SELECT true, false; bool | bool --+-- foo | bar (1 row) \pset booltrue 't' \pset boolfalse 'f' SELECT true, false; bool | bool --+-- t| f (1 row) As always, comments welcome. Why not just do it in the SQL? SELECT CASE WHEN whatever THEN 'foo' ELSE 'bar' END AS whatever; I understand this motivation - although I was more happy with server side solution. Wouldn't a server-side solution risk breaking many things that depends on such a representation? You wouldn't be able to pick and choose what comes out of the server on a per-application basis unless you use cumbersome CASE clauses in every query that returns boolean data. It sounds like keeping it at the application level is the least disruptive, and there is a precedent for such changes, such as NULL representation. -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql boolean display
On 20 August 2012 22:31, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:19 PM, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:11 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2012/8/20 Robert Haas robertmh...@gmail.com: On Sun, Aug 19, 2012 at 12:02 PM, Phil Sorber p...@omniti.com wrote: I am providing a patch to allow you to change the output of a boolean value in psql much like you can do with NULL. A client requested this feature and we thought it may appeal to someone else in the community. The patch includes updated docs and a regression test. The code changes themselves are pretty simple and straightforward. Example from the regression test: SELECT true, false; bool | bool --+-- t| f (1 row) \pset booltrue 'foo' \pset boolfalse 'bar' SELECT true, false; bool | bool --+-- foo | bar (1 row) \pset booltrue 't' \pset boolfalse 'f' SELECT true, false; bool | bool --+-- t| f (1 row) As always, comments welcome. Why not just do it in the SQL? SELECT CASE WHEN whatever THEN 'foo' ELSE 'bar' END AS whatever; I understand this motivation - although I was more happy with server side solution. Was a server side implementation submitted before? I can change it, but I did it on the client side like the null display was done. Or how about both? Surely one would break the other? -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql boolean display
2012/8/20 Thom Brown t...@linux.com: On 20 August 2012 22:10, Pavel Stehule pavel.steh...@gmail.com wrote: 2012/8/20 Robert Haas robertmh...@gmail.com: On Sun, Aug 19, 2012 at 12:02 PM, Phil Sorber p...@omniti.com wrote: I am providing a patch to allow you to change the output of a boolean value in psql much like you can do with NULL. A client requested this feature and we thought it may appeal to someone else in the community. The patch includes updated docs and a regression test. The code changes themselves are pretty simple and straightforward. Example from the regression test: SELECT true, false; bool | bool --+-- t| f (1 row) \pset booltrue 'foo' \pset boolfalse 'bar' SELECT true, false; bool | bool --+-- foo | bar (1 row) \pset booltrue 't' \pset boolfalse 'f' SELECT true, false; bool | bool --+-- t| f (1 row) As always, comments welcome. Why not just do it in the SQL? SELECT CASE WHEN whatever THEN 'foo' ELSE 'bar' END AS whatever; I understand this motivation - although I was more happy with server side solution. Wouldn't a server-side solution risk breaking many things that depends on such a representation? You wouldn't be able to pick and choose what comes out of the server on a per-application basis unless you use cumbersome CASE clauses in every query that returns boolean data. It sounds like keeping it at the application level is the least disruptive, and there is a precedent for such changes, such as NULL representation. It is similar to datetime formatting or to number formatting. Sometimes it is issue for some untyped languages because 't' or 'f' has not adequate sense - but it is unfriendly when somebody working with console. console based solution like proposed patch is probably the most less evil solution. Using 't' and 'f' was not a best idea. -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql boolean display
On Aug 20, 2012 5:56 PM, Thom Brown t...@linux.com wrote: On 20 August 2012 22:31, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:19 PM, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:11 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2012/8/20 Robert Haas robertmh...@gmail.com: On Sun, Aug 19, 2012 at 12:02 PM, Phil Sorber p...@omniti.com wrote: I am providing a patch to allow you to change the output of a boolean value in psql much like you can do with NULL. A client requested this feature and we thought it may appeal to someone else in the community. The patch includes updated docs and a regression test. The code changes themselves are pretty simple and straightforward. Example from the regression test: SELECT true, false; bool | bool --+-- t| f (1 row) \pset booltrue 'foo' \pset boolfalse 'bar' SELECT true, false; bool | bool --+-- foo | bar (1 row) \pset booltrue 't' \pset boolfalse 'f' SELECT true, false; bool | bool --+-- t| f (1 row) As always, comments welcome. Why not just do it in the SQL? SELECT CASE WHEN whatever THEN 'foo' ELSE 'bar' END AS whatever; I understand this motivation - although I was more happy with server side solution. Was a server side implementation submitted before? I can change it, but I did it on the client side like the null display was done. Or how about both? Surely one would break the other? If using both. -- Thom
Re: [HACKERS] PATCH: psql boolean display
On 20 August 2012 23:06, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:56 PM, Thom Brown t...@linux.com wrote: On 20 August 2012 22:31, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:19 PM, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:11 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2012/8/20 Robert Haas robertmh...@gmail.com: On Sun, Aug 19, 2012 at 12:02 PM, Phil Sorber p...@omniti.com wrote: I am providing a patch to allow you to change the output of a boolean value in psql much like you can do with NULL. A client requested this feature and we thought it may appeal to someone else in the community. The patch includes updated docs and a regression test. The code changes themselves are pretty simple and straightforward. Example from the regression test: SELECT true, false; bool | bool --+-- t| f (1 row) \pset booltrue 'foo' \pset boolfalse 'bar' SELECT true, false; bool | bool --+-- foo | bar (1 row) \pset booltrue 't' \pset boolfalse 'f' SELECT true, false; bool | bool --+-- t| f (1 row) As always, comments welcome. Why not just do it in the SQL? SELECT CASE WHEN whatever THEN 'foo' ELSE 'bar' END AS whatever; I understand this motivation - although I was more happy with server side solution. Was a server side implementation submitted before? I can change it, but I did it on the client side like the null display was done. Or how about both? Surely one would break the other? If using both. Yes. :) -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql boolean display
On Mon, Aug 20, 2012 at 5:54 PM, Thom Brown t...@linux.com wrote: On 20 August 2012 22:10, Pavel Stehule pavel.steh...@gmail.com wrote: 2012/8/20 Robert Haas robertmh...@gmail.com: On Sun, Aug 19, 2012 at 12:02 PM, Phil Sorber p...@omniti.com wrote: I am providing a patch to allow you to change the output of a boolean value in psql much like you can do with NULL. A client requested this feature and we thought it may appeal to someone else in the community. The patch includes updated docs and a regression test. The code changes themselves are pretty simple and straightforward. Example from the regression test: SELECT true, false; bool | bool --+-- t| f (1 row) \pset booltrue 'foo' \pset boolfalse 'bar' SELECT true, false; bool | bool --+-- foo | bar (1 row) \pset booltrue 't' \pset boolfalse 'f' SELECT true, false; bool | bool --+-- t| f (1 row) As always, comments welcome. Why not just do it in the SQL? SELECT CASE WHEN whatever THEN 'foo' ELSE 'bar' END AS whatever; I understand this motivation - although I was more happy with server side solution. Wouldn't a server-side solution risk breaking many things that depends on such a representation? You wouldn't be able to pick and choose what comes out of the server on a per-application basis unless you use cumbersome CASE clauses in every query that returns boolean data. It sounds like keeping it at the application level is the least disruptive, and there is a precedent for such changes, such as NULL representation. On occasions I have wanted psql to emit the full 'True'/'False' words instead of cryptic one-letter t/f, which can get lost on long rows that get wrapped around on screen. Writing long-winded CASE expressions to get the effect is too much for small ad-hoc queries. I thought of inventing a data type whose out-function would emit these strings, and tack a ::mybool to the expression I want modified. But that would break the applications if somebody pasted the same query in an application (JDBC or some such that understands boolean) and expected a boolean data type instead of a text output of an expression. I think there's a merit to psql supporting this feature, because psql is most commonly used for ad-hoc interactive use, and true/false is more human consumable than t/f (I have had a Java developer ask me what was that 't' value in the resultset in psql). -- Gurjeet Singh
Re: [HACKERS] PATCH: psql boolean display
On Aug 20, 2012 6:08 PM, Thom Brown t...@linux.com wrote: On 20 August 2012 23:06, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:56 PM, Thom Brown t...@linux.com wrote: On 20 August 2012 22:31, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:19 PM, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:11 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2012/8/20 Robert Haas robertmh...@gmail.com: On Sun, Aug 19, 2012 at 12:02 PM, Phil Sorber p...@omniti.com wrote: I am providing a patch to allow you to change the output of a boolean value in psql much like you can do with NULL. A client requested this feature and we thought it may appeal to someone else in the community. The patch includes updated docs and a regression test. The code changes themselves are pretty simple and straightforward. Example from the regression test: SELECT true, false; bool | bool --+-- t| f (1 row) \pset booltrue 'foo' \pset boolfalse 'bar' SELECT true, false; bool | bool --+-- foo | bar (1 row) \pset booltrue 't' \pset boolfalse 'f' SELECT true, false; bool | bool --+-- t| f (1 row) As always, comments welcome. Why not just do it in the SQL? SELECT CASE WHEN whatever THEN 'foo' ELSE 'bar' END AS whatever; I understand this motivation - although I was more happy with server side solution. Was a server side implementation submitted before? I can change it, but I did it on the client side like the null display was done. Or how about both? Surely one would break the other? If using both. Yes. :) Really server would override client. -- Thom
Re: [HACKERS] PATCH: psql boolean display
Gurjeet Singh singh.gurj...@gmail.com wrote: On occasions I have wanted psql to emit the full 'True'/'False' words instead of cryptic one-letter t/f, which can get lost on long rows that get wrapped around on screen. Writing long-winded CASE expressions to get the effect is too much for small ad-hoc queries. I thought of inventing a data type whose out-function would emit these strings, and tack a ::mybool to the expression I want modified. But that would break the applications if somebody pasted the same query in an application (JDBC or some such that understands boolean) and expected a boolean data type instead of a text output of an expression. The type itself does output true/false; it's just psql that uses t/f. test=# select 'true'::boolean::text; text -- true (1 row) test=# select 'false'::boolean::text; text --- false (1 row) That has always seemed quite odd (and occasionally inconvenient) to me. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql boolean display
On 20 August 2012 23:16, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 6:08 PM, Thom Brown t...@linux.com wrote: On 20 August 2012 23:06, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:56 PM, Thom Brown t...@linux.com wrote: On 20 August 2012 22:31, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:19 PM, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:11 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2012/8/20 Robert Haas robertmh...@gmail.com: On Sun, Aug 19, 2012 at 12:02 PM, Phil Sorber p...@omniti.com wrote: I am providing a patch to allow you to change the output of a boolean value in psql much like you can do with NULL. A client requested this feature and we thought it may appeal to someone else in the community. The patch includes updated docs and a regression test. The code changes themselves are pretty simple and straightforward. Example from the regression test: SELECT true, false; bool | bool --+-- t| f (1 row) \pset booltrue 'foo' \pset boolfalse 'bar' SELECT true, false; bool | bool --+-- foo | bar (1 row) \pset booltrue 't' \pset boolfalse 'f' SELECT true, false; bool | bool --+-- t| f (1 row) As always, comments welcome. Why not just do it in the SQL? SELECT CASE WHEN whatever THEN 'foo' ELSE 'bar' END AS whatever; I understand this motivation - although I was more happy with server side solution. Was a server side implementation submitted before? I can change it, but I did it on the client side like the null display was done. Or how about both? Surely one would break the other? If using both. Yes. :) Really server would override client. Come to think of it, if the client could detect the server's bool out config, it could override the server in that instance since it would know what it was looking for, so perhaps they could coexist. -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql boolean display
On Aug 20, 2012 6:28 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Gurjeet Singh singh.gurj...@gmail.com wrote: On occasions I have wanted psql to emit the full 'True'/'False' words instead of cryptic one-letter t/f, which can get lost on long rows that get wrapped around on screen. Writing long-winded CASE expressions to get the effect is too much for small ad-hoc queries. I thought of inventing a data type whose out-function would emit these strings, and tack a ::mybool to the expression I want modified. But that would break the applications if somebody pasted the same query in an application (JDBC or some such that understands boolean) and expected a boolean data type instead of a text output of an expression. The type itself does output true/false; it's just psql that uses t/f. test=# select 'true'::boolean::text; text -- true (1 row) test=# select 'false'::boolean::text; text --- false (1 row) That has always seemed quite odd (and occasionally inconvenient) to me. I think that may be from the cast. I didn't see any transformation in psql. Looked like it was raw output from the server. -Kevin
Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table
Robert Haas robertmh...@gmail.com writes: On Mon, Aug 20, 2012 at 4:27 PM, Tom Lane t...@sss.pgh.pa.us wrote: Surely we could just prevent creation of the FSM until the table has reached at least, say, 10 blocks. Any threshold beyond one block would mean potential space wastage, but it's hard to get excited about that until you're into the dozens of pages. I dunno, I think one-row tables are pretty common. Sure, and for that you don't need an FSM, because any row allocation attempt will default to trying the last existing block before it extends (see RelationGetBufferForTuple). It's only once you've got more than one block in the table that it becomes interesting. If we had a convention that FSM is only created for rels of more than N blocks, perhaps it'd be worthwhile to teach RelationGetBufferForTuple to try all existing blocks when relation size = N. Or equivalently, hack the FSM code to return all N pages when it has no info. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Outdated Japanse developers FAQ
Please let me know if this is not the right place to ask this kind of queston. PostgreSQL Developers FAQ in Japanese: http://wiki.postgresql.org/wiki/Developer_FAQ/ja looks pretty outdated. It was last updated on 7 November 2010 (English FAQ was last updated on 27 September 2011). Even it says PostgreSQL's repository is CVS, not git. Does anybody know who is the mainter for this? If there's no particular maintainer for this, I would like to volunteer to update the page. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sha1, sha2 functions into core?
-BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 If the hacker has access to the salt, then it will only slow him/her down somewhat because the search will be have to be restarted for each password. This. Further, anyone using MD5 or SHA* or any hash function for any serious storage of passwords is nuts, in this day and age. GPUs and rentable cloud computers means the ability to test billions of passwords per second is easy for anyone, salted or not. The issue is not Postgres' internal use of MD5 for passwords - that's a red herring, as it is basically no more relatively secure/insecure versus any other hashing algorithm that is not designed to be slow (e.g. bcrypt, scrypt, PBKDF2). The issue is simply exposing a more useful day to day algorithm by default. Much of the world uses SHA instead of MD5 these days for all sorts of purposes. So I am torn on this. On the one hand, having a few more things in core would be very nice, as it seems silly we have md5() as a builtin but sha256() requires a special module. But once you add sha* in, why not AES? Blowfish? Why not go the whole way and include some extremely useful ones such as bcrypt? At that point, we've deprecated pg_crypto and moved everything to core. Why I personally would love to see that someday (then we can boast built-in crypto :), I recognize that will be a very tough sell. So I will take the addition of whatever we can, including just a sha() as this thread asked for. 3) use a purposefully slow hashing function like bcrypt. but I disagree: I don't like any scheme that encourages use of low entropy passwords. Perhaps off-topic, but how to do you figure that? - -- Greg Sabino Mullane g...@turnstep.com End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 201208201849 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -BEGIN PGP SIGNATURE- iEYEAREDAAYFAlAywBwACgkQvJuQZxSWSsiS4QCbBC7X9MyQgVKC3DTKgjv0aj7D ik0AoNh1YBmhuaMXEKOP7z/GEBUR+EHe =54A2 -END PGP SIGNATURE- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql boolean display
On Aug 20, 2012 6:31 PM, Thom Brown t...@linux.com wrote: On 20 August 2012 23:16, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 6:08 PM, Thom Brown t...@linux.com wrote: On 20 August 2012 23:06, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:56 PM, Thom Brown t...@linux.com wrote: On 20 August 2012 22:31, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:19 PM, Phil Sorber p...@omniti.com wrote: On Aug 20, 2012 5:11 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2012/8/20 Robert Haas robertmh...@gmail.com: On Sun, Aug 19, 2012 at 12:02 PM, Phil Sorber p...@omniti.com wrote: I am providing a patch to allow you to change the output of a boolean value in psql much like you can do with NULL. A client requested this feature and we thought it may appeal to someone else in the community. The patch includes updated docs and a regression test. The code changes themselves are pretty simple and straightforward. Example from the regression test: SELECT true, false; bool | bool --+-- t| f (1 row) \pset booltrue 'foo' \pset boolfalse 'bar' SELECT true, false; bool | bool --+-- foo | bar (1 row) \pset booltrue 't' \pset boolfalse 'f' SELECT true, false; bool | bool --+-- t| f (1 row) As always, comments welcome. Why not just do it in the SQL? SELECT CASE WHEN whatever THEN 'foo' ELSE 'bar' END AS whatever; I understand this motivation - although I was more happy with server side solution. Was a server side implementation submitted before? I can change it, but I did it on the client side like the null display was done. Or how about both? Surely one would break the other? If using both. Yes. :) Really server would override client. Come to think of it, if the client could detect the server's bool out config, it could override the server in that instance since it would know what it was looking for, so perhaps they could coexist. I think Pavel has a patch in that is meant to sync variables between client and server. Perhaps we can use the same facility? -- Thom
Re: [HACKERS] sha1, sha2 functions into core?
Joshua D. Drake j...@commandprompt.com writes: On 08/20/2012 01:33 PM, Andrew Dunstan wrote: But there is absolutely no evidence that we are making it less useful. Postgres is designed top be extensible and we've just enhanced that. pgcrypto makes use of that. If we cen leverage that to make Postgres available to more people then why would we not do so? O.k. that is valid a valid argument. Let me counter. Everybody else does it, why don't we? PostgreSQL is extensible, modular and programmable, why are we limiting those features by not including them in core? Contrib, whether we like it or not, is not core. Nonsense. By that argument, all the sweat we've expended on extensibility was wasted effort, and everything should be in core. pg_crypto's functionality is perfectly fine where it is. The fact that there might be some contexts where people actively don't want the functionality in core is just a small extra reason not to be in a hurry to merge it --- but even without that, I'd vote against this on overall project management grounds. We should be looking to push decouplable bits of functionality *out* of core, not bring them back in. The only reason I can see for pushing more crypto into core is if we needed to stop using MD5 for the core password authentication functionality. While that might come to pass eventually, I am aware of no evidence whatever that SHAn, per se, is an improvement over MD5 for password auth purposes. Moreover, as Josh just mentioned, anybody who thinks it might be insufficiently secure for their purposes has got plenty of alternatives available today (SSL certificates, PAM backed by whatever-you-want, etc). TBH, I think if we do anything at all about this in the near future, it'll be window dressing to shut up the people who heard once that MD5 was insecure and know nothing about it beyond that --- but if Postgres uses MD5 for passwords, it must be insecure. So I tend to agree with Andrew that we should wait till the NIST competition dust settles; but what I'll be looking for afterwards is which algorithm has the most street cred with the average slashdotter. Also, as I mentioned upthread, we need to do more than just drop in a new hashing algorithm. MD5 is far from being the weakest link in what we're doing today. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql boolean display
On Aug 20, 2012 6:28 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Gurjeet Singh singh.gurj...@gmail.com wrote: On occasions I have wanted psql to emit the full 'True'/'False' words instead of cryptic one-letter t/f, which can get lost on long rows that get wrapped around on screen. Writing long-winded CASE expressions to get the effect is too much for small ad-hoc queries. I thought of inventing a data type whose out-function would emit these strings, and tack a ::mybool to the expression I want modified. But that would break the applications if somebody pasted the same query in an application (JDBC or some such that understands boolean) and expected a boolean data type instead of a text output of an expression. The type itself does output true/false; it's just psql that uses t/f. test=# select 'true'::boolean::text; text -- true (1 row) test=# select 'false'::boolean::text; text --- false (1 row) That has always seemed quite odd (and occasionally inconvenient) to me. I think that may be from the cast. I didn't see any transformation in psql. Looked like it was raw output from the server. Right. t, f are generated in backend. See boolout() in backend/utils/adt/bool.c for more details. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] temporal support patch
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Robert Haas Sent: Monday, August 20, 2012 5:04 PM To: Jeff Davis Cc: Vlad Arkhipov; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] temporal support patch On Sun, Aug 19, 2012 at 6:28 PM, Jeff Davis pg...@j-davis.com wrote: The other issue is how to handle multiple changes of the same record within the transaction. Should they be stored or not? In a typical audit log, I don't see any reason to. The internals of a transaction should be implementation details; invisible to the outside, right? I'm not convinced. Ideally the decision of whether to do so could be a client decision. Not storing intra-transaction changes is easier than storing all changes. At worse you could stage up all changed then simply fail to store all intermediate results within a given relation. It that case you gain nothing in execution performance but safe both storage and interpretative resources. So the question becomes is it worth doing without the ability to store intermediate results? If you were to ponder both which setup would the default be? If the default is the harder one (all statements) to implement then to avoid upgrade issues the syntax should specify that it is logging transactions only. Random, somewhat related, thought: I do all my working on a temporary staging table and then, as my final action, insert the resultant records onto a separate live table and drop the temporary table. Further changes to said record I perform by deleting the original then inserting a new record (from staging again) with all the values changed. Obviously this has limitations with respect to foreign keys and such but it is possible. What happens to the audit log if the PK changes and if it does not change? Any other implications that need to be address or is it like giving a loaded gun to someone and trust them to use is responsibily? I'm not sure that the database user is the proper thing to be stored in the history table. Many applications usually connect to a database using some virtual user and have their own users/roles tables to handle with privileges. There should be some way to substitute the stored user in the history table with the application's one. It's also helpful to store transaction id that inserted/updated/deleted the record. If the system is recording it for audit purposes, then it better be sure that it's true. You can't allow the application to pick and choose what gets stored there. That position would render this feature useless for every application for which I would otherwise have used it. I think it's just nonsense to talk about what we can or can't let the user do. The user is in charge, and our job is to allow him to do what he wants to do more easily, not to dictate what he must do. -- I see the user element as having two components: Client - what device/channel/user was used to connect to the database - PostgreSQL Role User - relative to that client which actual user performed the action - Application Specified A PostgreSQL role would correspond to client whereas the application would be allowed to have full control of what User value is stored. This gets a little complicated with respect to SET ROLE but gets close to the truth. The idea is that you look at the client to determine the namespace over which the user is defined and identified. So, a better way to phrase the position is that: You cannot allow the application to choose what is stored to identify itself (client) - i.e., its credentials identify who it is and those are stored without consulting the application At that point you've basically shifted responsibility for the correctness of the audit log onto that application and away from the database. However, you do provide a place for the application to store an identifier that it is able to resolve to a user if necessary. This is an arbitrary two-layer hierarchy and while conceptually anything with two layers may want more I am not sure whether the extra complexity that would entail would be worth the effort. Depending on what kinds of information you allowed to be stored for User it becomes something that can be modeled when desired and ignored otherwise. The issue with adding the PostgreSQL role to the database in this way is that you now can never delete that role or reassign it to another entity. I guess with temporal you could do so and basically have the identity-role relationship define over specific periods of time... I can (have) imagine a whole level of indirection and association to be able to reasonably handle assigning and storing permanent identities while allowing logon credentials to remain outside of permanent storage. David J. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription:
Re: [HACKERS] PATCH: psql boolean display
Kevin Grittner kevin.gritt...@wicourts.gov writes: The type itself does output true/false; it's just psql that uses t/f. No, 't'/'f' is what boolout() returns. The 'true'/'false' results from casting bool to text are intentionally different --- IIRC, Peter E. argued successfully that this cast behavior is required by SQL spec. But we'd already been returning 't'/'f' to applications for far too many years to change it. (And that argument has not gotten any weaker since then. Keep in mind that Postgres was returning 't'/'f' for bool years before the SQL spec even had a boolean type.) If we're going to do something like this at all, I agree that psql is the place to do it, not the server. But my beef with this patch is that it's thinking too small --- why would bool be the only type that somebody would want to editorialize on the display of? I'd rather see some general substitute this for that in display of columns of type X feature. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] temporal support patch
On 8/20/12 4:17 PM, David Johnston wrote: The issue with adding the PostgreSQL role to the database in this way is that you now can never delete that role or reassign it to another entity. I guess with temporal you could do so and basically have the identity-role relationship define over specific periods of time... I can (have) imagine a whole level of indirection and association to be able to reasonably handle assigning and storing permanent identities while allowing logon credentials to remain outside of permanent storage. This is sounding like a completely runaway spec on what should be a simple feature. If you want something in core which will be useful to a lot of our users, it needs to be simple and flexible. Not ornate with lots of dependancies. The first version of it should be as simple and minimalist as possible. Personally, I would prefer a tool which just made it simpler to build my own triggers, and made it automatic for the history table to track changes in the live table. I think anything we build which controls what goes into the history table, etc., will only narrow the user base. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Outdated Japanse developers FAQ
On Tue, Aug 21, 2012 at 7:49 AM, Tatsuo Ishii is...@postgresql.org wrote: Please let me know if this is not the right place to ask this kind of queston. PostgreSQL Developers FAQ in Japanese: http://wiki.postgresql.org/wiki/Developer_FAQ/ja looks pretty outdated. It was last updated on 7 November 2010 (English FAQ was last updated on 27 September 2011). Even it says PostgreSQL's repository is CVS, not git. Does anybody know who is the mainter for this? Itagaki-san according to the history page: http://wiki.postgresql.org/index.php?title=Developer_FAQ/jaaction=history If there's no particular maintainer for this, I would like to volunteer to update the page. Please feel free to update the page. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Outdated Japanse developers FAQ
On Tue, Aug 21, 2012 at 7:49 AM, Tatsuo Ishii is...@postgresql.org wrote: Please let me know if this is not the right place to ask this kind of queston. PostgreSQL Developers FAQ in Japanese: http://wiki.postgresql.org/wiki/Developer_FAQ/ja looks pretty outdated. It was last updated on 7 November 2010 (English FAQ was last updated on 27 September 2011). Even it says PostgreSQL's repository is CVS, not git. Does anybody know who is the mainter for this? Itagaki-san according to the history page: http://wiki.postgresql.org/index.php?title=Developer_FAQ/jaaction=history If there's no particular maintainer for this, I would like to volunteer to update the page. Please feel free to update the page. Ok, I will do it with my colleagues. BTW, user's FAQ is also outdated (last update was 16 May 2010). Unfortunately I don't have time to work on it. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sha1, sha2 functions into core?
On 08/20/2012 07:08 PM, Tom Lane wrote: Moreover, as Josh just mentioned, anybody who thinks it might be insufficiently secure for their purposes has got plenty of alternatives available today (SSL certificates, PAM backed by whatever-you-want, etc). Yeah, I think we need to emphasize this lots more. Anyone who wants really secure authentication needs to be getting away from password based auth altogether. Another hash function will make very little difference. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] temporal support patch
Josh Berkus j...@agliodbs.com wrote: This is sounding like a completely runaway spec on what should be a simple feature. I hate to contribute to scope creep (or in this case scope screaming down the tracks at full steam), but I've been watching this with a queasy feeling about interaction with Serializable Snapshot Isolation (SSI). Under SSI the apparent order of execution is not always the transaction commit order, or the transaction start order. So a temporal database would be vulnerable to seeing anomalies like this one unless rw-conflicts (as tracked with predicate locks) are considered: http://wiki.postgresql.org/wiki/SSI#Deposit_Report This raises something I talked vaguely about in Ottawa this year, although it was pretty much at the hand-waving stage and I don't know how well I got the idea across. I've been thinking about the problems with all the various replication technologies being able to present data consistent with serializable transactions, and have the outlines of a technique I think might be more palatable to the community that those previously discussed. Basically, it would involve generating a list of committed XIDs in *apparent order of execution*, and creating snapshots on the replicas based on that instead of just the master's transaction commit order. I've been trying to work through the details to the point where I can present a coherent write-up on it. I wouldn't want to hold up a feature like temporal queries on the basis that it didn't immediately play nice with SSI, but it seems like it would be a good thing if the view of the past wasn't too strictly tied to transaction commit sequence; a little bit of abstraction there might save a lot of pain in tying these features together. Maybe something along the lines of a transaction visibility sequence number, or *maybe* a timestamptz works as long as that can be fudged to a time after the commit time for transactions involved in rw-conflicts with concurrent transactions. (I'm not sure microsecond resolution works for other, reasons, but if it does...) I think either could work. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sha1, sha2 functions into core?
On 08/20/2012 05:12 PM, Andrew Dunstan wrote: On 08/20/2012 07:08 PM, Tom Lane wrote: Moreover, as Josh just mentioned, anybody who thinks it might be insufficiently secure for their purposes has got plenty of alternatives available today (SSL certificates, PAM backed by whatever-you-want, etc). Yeah, I think we need to emphasize this lots more. Anyone who wants really secure authentication needs to be getting away from password based auth altogether. Another hash function will make very little difference. Actually, I concede here. If we were pushing our other abilities more visibly, I don't know that this argument would ever come up. Sincerely, Joshua D. Drake cheers andrew -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC @cmdpromptinc - 509-416-6579 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] TODO
I found this in https://wiki.postgresql.org/wiki/Todo : Improve ability to display optimizer analysis using OPTIMIZER_DEBUG What does this actually mean? Add GUC switch to enable optimizer debug on/off? More fancy/useful info should be printed? If so, what kind of information is required? -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql boolean display
On Mon, Aug 20, 2012 at 7:19 PM, Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: The type itself does output true/false; it's just psql that uses t/f. No, 't'/'f' is what boolout() returns. The 'true'/'false' results from casting bool to text are intentionally different --- IIRC, Peter E. argued successfully that this cast behavior is required by SQL spec. But we'd already been returning 't'/'f' to applications for far too many years to change it. (And that argument has not gotten any weaker since then. Keep in mind that Postgres was returning 't'/'f' for bool years before the SQL spec even had a boolean type.) If we're going to do something like this at all, I agree that psql is the place to do it, not the server. But my beef with this patch is that it's thinking too small --- why would bool be the only type that somebody would want to editorialize on the display of? I'd rather see some general substitute this for that in display of columns of type X feature. regards, tom lane Sorry, was on my phone before and couldn't type the response I wanted to. I like where your head is with the more general case, and I tried to think along those lines too, but I could not come up with a grand unifying way to do even null and boolean together, never mind other ways that I hadn't even thought about. The boolean case is a single datatype where the null case crosses all nullable types. With null you only have to handle one case, with boolean, you have two. What I settled upon was the simplest way I could think of with the most flexibility. In my opinion, the utility of this patch outweigh's the niche implementation. I'd welcome other approaches that covered this in a more generic way, I just can't think of any that aren't overly complex. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] temporal support patch
On Mon, 2012-08-20 at 17:04 -0400, Robert Haas wrote: On Sun, Aug 19, 2012 at 6:28 PM, Jeff Davis pg...@j-davis.com wrote: The other issue is how to handle multiple changes of the same record within the transaction. Should they be stored or not? In a typical audit log, I don't see any reason to. The internals of a transaction should be implementation details; invisible to the outside, right? I'm not convinced. As I understand it, we are talking about recording data changes in one table to another table. Auditing of reads or the logging of raw statements seem like very different kinds of projects to me; but tell me if you think differently. So if we are recording data changes, I don't see much point in recording uncommitted changes. Perhaps my imagination is failing, and someone else can fill me in on a use case. I'm also struggling with the semantics: if we record uncommitted changes, do we record them even if the transaction aborts? If so, what guarantees do we offer about the change actually being recorded? I'm not sure that the database user is the proper thing to be stored in the history table. Many applications usually connect to a database using some virtual user and have their own users/roles tables to handle with privileges. There should be some way to substitute the stored user in the history table with the application's one. It's also helpful to store transaction id that inserted/updated/deleted the record. If the system is recording it for audit purposes, then it better be sure that it's true. You can't allow the application to pick and choose what gets stored there. That position would render this feature useless for every application for which I would otherwise have used it. We could offer a GUC like audit_context or audit_app_context that takes a text string, and the audit log would record the value stored in that GUC along with the data changes in question. The main thing I object to is an implication that the system is vouching for some particular fact that is supplied by a userset GUC. Remember, there are guaranteed to be application-level problems that allow these GUCs to get set improperly for all kinds of reasons. We don't want bug reports along the lines of security breach! PG allows application_name to be spoofed in the audit log!. Also, I'd prefer not use existing GUCs, because there may be all kinds of other reasons that people set existing GUCs, and we want them to be able to handle the audit_context one more carefully and have a clear warning in the documentation. I think it's just nonsense to talk about what we can or can't let the user do. The user is in charge, and our job is to allow him to do what he wants to do more easily, not to dictate what he must do. Remember that the users who depend on the veracity of the audit log are users, too. Let's try to serve both classes of user if we can. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] temporal support patch
On Mon, 2012-08-20 at 19:17 -0400, David Johnston wrote: Ideally the decision of whether to do so could be a client decision. Not storing intra-transaction changes is easier than storing all changes. At worse you could stage up all changed then simply fail to store all intermediate results within a given relation. It that case you gain nothing in execution performance but safe both storage and interpretative resources. So the question becomes is it worth doing without the ability to store intermediate results? If you were to ponder both which setup would the default be? If the default is the harder one (all statements) to implement then to avoid upgrade issues the syntax should specify that it is logging transactions only. I think the biggest question here is what guarantees can be offered? What if the transaction aborts after having written some data, does the audit log still get updated? I see the user element as having two components: I think this is essentially a good idea, although as I said in my other email, we should be careful how we label the application-supplied information in the audit log. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] temporal support patch
On Mon, 2012-08-20 at 16:32 -0700, Josh Berkus wrote: This is sounding like a completely runaway spec on what should be a simple feature. My feeling as well. However, we will eventually want to coalesce around some best practices and make it easy and robust for typical cases. Personally, I would prefer a tool which just made it simpler to build my own triggers, and made it automatic for the history table to track changes in the live table. I think anything we build which controls what goes into the history table, etc., will only narrow the user base. That sounds like a good way to start. Actually, even before the tool, how about just some really good examples of triggers for specific kinds of audit logs, and some ways to run queries on them? I think that might settle a lot of these details. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] temporal support patch
On Mon, 2012-08-20 at 19:32 -0500, Kevin Grittner wrote: Josh Berkus j...@agliodbs.com wrote: This is sounding like a completely runaway spec on what should be a simple feature. I hate to contribute to scope creep (or in this case scope screaming down the tracks at full steam), but I've been watching this with a queasy feeling about interaction with Serializable Snapshot Isolation (SSI). There are all kinds of challenges here, and I'm glad you're thinking about them. I alluded to some problems here: http://archives.postgresql.org/message-id/1345415312.20987.56.camel@jdavis But those might be a subset of the problems you're talking about. It sounds like, at a high level, there are two problems: 1. capturing the apparent order of execution in the audit log 2. assigning meaningful times to the changes that are consistent with the apparent order of execution Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] temporal support patch
On 08/21/2012 12:52 PM, Jeff Davis wrote: On Mon, 2012-08-20 at 16:32 -0700, Josh Berkus wrote: This is sounding like a completely runaway spec on what should be a simple feature. My feeling as well. However, we will eventually want to coalesce around some best practices and make it easy and robust for typical cases. Personally, I would prefer a tool which just made it simpler to build my own triggers, and made it automatic for the history table to track changes in the live table. I think anything we build which controls what goes into the history table, etc., will only narrow the user base. FWIW, I've found the ability to exclude columns from my history triggers to be important because of: - optimistic locking fields used by some clients; and - Trigger-maintained summary fields Without being able to apply some exclusions there's just too much churn in the history of some tables. Here's what I'm using at the moment: http://wiki.postgresql.org/wiki/Audit_trigger_91plus (I know storing both the relation oid and the text-form table and schema name is redundant. The text is handy if the table is dropped and recreated, though, and the oid is quicker easier much of the time). I use both the per-query and per-row forms depending on the granularity I need. -- Craig Ringer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] NOT NULL constraints in foreign tables
On Mon, 2012-08-20 at 16:50 -0400, Robert Haas wrote: #3 for foreign tables. I'm skeptical of that approach for two reasons: (1) It will be hard to inform users which constraints are enforced and which aren't. (2) It will be hard for users to understand the planner benefits or the consequences when the constraint is not enforced. That being said, I can imagine good use cases (like when the foreign table is in postgres, and already has that constraint declared), so I'm not outright opposed to it. #1 is not a reasonable alternative for foreign tables because we lack enforcement power in that case, Right. and #2 is also not reasonable, because the only point of allowing declarative constraints is to get better performance, and if we go with #2 then we've pretty much thrown that out the window. Declared constraints can improve the plans, while runtime-enforced constraints slow down execution of a given plan. I'm not really sure whether runtime enforcement is a good trade-off, but it doesn't seem like an obviously bad one. Also, what did you mean by the only point of allowing declarative constraints is to get better performance? Maybe the user wants to get an error if some important assumption about the remote data source is not as true as when they declared the constraint. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] temporal support patch
On 08/21/2012 12:52 PM, Jeff Davis wrote: On Mon, 2012-08-20 at 16:32 -0700, Josh Berkus wrote: This is sounding like a completely runaway spec on what should be a simple feature. My feeling as well. However, we will eventually want to coalesce around some best practices and make it easy and robust for typical cases. Personally, I would prefer a tool which just made it simpler to build my own triggers, and made it automatic for the history table to track changes in the live table. I think anything we build which controls what goes into the history table, etc., will only narrow the user base. That sounds like a good way to start. Actually, even before the tool, how about just some really good examples of triggers for specific kinds of audit logs That reminds me: The single biggest improvement I can see for audit triggers would be to provide an _easy_ and _efficient_ way to test whether any fields have changed between OLD and NEW *except* for one or more ignored fields. Right now if I have a four-column table and I want to ignore UPDATEs to col2 for audit purposes, I have to write: CREATE TRIGGER tablename_audit_insert_delete AFTER INSERT OR DELETE ON sometable FOR EACH ROW EXECUTE PROCEDURE audit_func(); CREATE TRIGGER tablename_audit_update_selective AFTER UPDATE ON sometable FOR EACH ROW WHEN ( OLD.col1 IS DISTINCT FROM NEW.col1 OR OLD.col3 IS DISTINCT FROM NEW.col3 OR OLD.col4 IS DISTINCT FROM NEW.col4 OR ) EXECUTE PROCEDURE audit.if_modified_func(); ... which is horrible for all sorts of reasons: - If a column is added the audit trigger also needs an update to test for it, otherwise it'll be ignored; - It isn't explicit that col2 is ignored; and - I have to repeat my trigger definitions twice. An alternative is to create hstores from OLD and NEW, delete the field of interest, and compare them. That's pretty slow though, and may duplicate work done by the already-expensive audit trigger. What I'm imagining is something like a: row_equals_ignorecols(OLD, NEW, 'col2') ... which would solve half the problem, and is simple enough I could implement it with a little C function. A way to avoid splitting the trigger function definition and a built-in compare rows except columns would be great, though. -- Craig Ringer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers