Re: [HACKERS] patch for type privileges

2012-05-20 Thread Peter Eisentraut
On lör, 2011-12-10 at 16:16 +0100, Yeb Havinga wrote:
 
 * Cannot restrict access to array types. After revoking usage from the 
 element type, the error is perhaps a bit misleading. (smallint[] vs 
 smallint)
 
 postgres=> create table a (a int2[]);
 ERROR:  permission denied for type smallint[]

This matter was still outstanding.  The problem with fixing this is that
you need to duplicate the array type to element type conversion in two
dozen places.  So I have refactored this into a separate function, which
also takes care of the call to format_type_be, which is equally
duplicated in as many places.
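
The user-visible change is only in the error message.  A rough reproduction of
Yeb's case (role setup elided; the second message is what the new helper is
meant to produce) looks like this:

    -- as a superuser: revoke the default USAGE privilege on the element type
    REVOKE USAGE ON TYPE smallint FROM PUBLIC;

    -- as an unprivileged role:
    CREATE TABLE a (a int2[]);
    -- before: ERROR:  permission denied for type smallint[]
    -- after:  ERROR:  permission denied for type smallint

The refactoring itself follows.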

diff --git i/src/backend/access/common/tupdesc.c w/src/backend/access/common/tupdesc.c
index 1f40b7c..aa1ce80 100644
--- i/src/backend/access/common/tupdesc.c
+++ w/src/backend/access/common/tupdesc.c
@@ -573,8 +573,7 @@
 
 		aclresult = pg_type_aclcheck(atttypid, GetUserId(), ACL_USAGE);
 		if (aclresult != ACLCHECK_OK)
-			aclcheck_error(aclresult, ACL_KIND_TYPE,
-		   format_type_be(atttypid));
+			aclcheck_error_type(aclresult, atttypid);
 
 		attcollation = GetColumnDefCollation(NULL, entry, atttypid);
 		attdim = list_length(entry->typeName->arrayBounds);
diff --git i/src/backend/catalog/aclchk.c w/src/backend/catalog/aclchk.c
index 9315e79..89b71b4 100644
--- i/src/backend/catalog/aclchk.c
+++ w/src/backend/catalog/aclchk.c
@@ -3389,6 +3389,19 @@ static AclMode pg_aclmask(AclObjectKind objkind, Oid table_oid, AttrNumber attnu
 }
 
 
+/*
+ * Special common handling for types: use element type instead of array type,
+ * and format nicely
+ */
+void
+aclcheck_error_type(AclResult aclerr, Oid typeOid)
+{
+	Oid element_type = get_element_type(typeOid);
+
+	aclcheck_error(aclerr, ACL_KIND_TYPE, format_type_be(element_type ? element_type : typeOid));
+}
+
+
 /* Check if given user has rolcatupdate privilege according to pg_authid */
 static bool
 has_rolcatupdate(Oid roleid)
diff --git i/src/backend/catalog/objectaddress.c w/src/backend/catalog/objectaddress.c
index 250069f..3c3fd05 100644
--- i/src/backend/catalog/objectaddress.c
+++ w/src/backend/catalog/objectaddress.c
@@ -932,8 +932,7 @@ static ObjectAddress get_object_address_opcf(ObjectType objtype, List *objname,
 		case OBJECT_DOMAIN:
 		case OBJECT_ATTRIBUTE:
 			if (!pg_type_ownercheck(address.objectId, roleid))
-aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_TYPE,
-			   format_type_be(address.objectId));
+aclcheck_error_type(ACLCHECK_NOT_OWNER, address.objectId);
 			break;
 		case OBJECT_AGGREGATE:
 		case OBJECT_FUNCTION:
diff --git i/src/backend/catalog/pg_aggregate.c w/src/backend/catalog/pg_aggregate.c
index 9ff70a5..82a2c9f 100644
--- i/src/backend/catalog/pg_aggregate.c
+++ w/src/backend/catalog/pg_aggregate.c
@@ -208,19 +208,16 @@ static Oid lookup_agg_function(List *fnName, int nargs, Oid *input_types,
 	{
 		aclresult = pg_type_aclcheck(aggArgTypes[i], GetUserId(), ACL_USAGE);
 		if (aclresult != ACLCHECK_OK)
-			aclcheck_error(aclresult, ACL_KIND_TYPE,
-		   format_type_be(aggArgTypes[i]));
+			aclcheck_error_type(aclresult, aggArgTypes[i]);
 	}
 
 	aclresult = pg_type_aclcheck(aggTransType, GetUserId(), ACL_USAGE);
 	if (aclresult != ACLCHECK_OK)
-		aclcheck_error(aclresult, ACL_KIND_TYPE,
-	   format_type_be(aggTransType));
+		aclcheck_error_type(aclresult, aggTransType);
 
 	aclresult = pg_type_aclcheck(finaltype, GetUserId(), ACL_USAGE);
 	if (aclresult != ACLCHECK_OK)
-		aclcheck_error(aclresult, ACL_KIND_TYPE,
-	   format_type_be(finaltype));
+		aclcheck_error_type(aclresult, finaltype);
 
 
 	/*
diff --git i/src/backend/commands/functioncmds.c w/src/backend/commands/functioncmds.c
index 5f1c19e..2a075a1 100644
--- i/src/backend/commands/functioncmds.c
+++ w/src/backend/commands/functioncmds.c
@@ -154,8 +154,7 @@ static void AlterFunctionOwner_internal(Relation rel, HeapTuple tup,
 
 	aclresult = pg_type_aclcheck(rettype, GetUserId(), ACL_USAGE);
 	if (aclresult != ACLCHECK_OK)
-		aclcheck_error(aclresult, ACL_KIND_TYPE,
-	   format_type_be(rettype));
+		aclcheck_error_type(aclresult, rettype);
 
 	*prorettype_p = rettype;
 	*returnsSet_p = returnType->setof;
@@ -247,8 +246,7 @@ static void AlterFunctionOwner_internal(Relation rel, HeapTuple tup,
 
 		aclresult = pg_type_aclcheck(toid, GetUserId(), ACL_USAGE);
 		if (aclresult != ACLCHECK_OK)
-			aclcheck_error(aclresult, ACL_KIND_TYPE,
-		   format_type_be(toid));
+			aclcheck_error_type(aclresult, toid);
 
 		if (t->setof)
 			ereport(ERROR,
@@ -1509,13 +1507,11 @@ static void AlterFunctionOwner_internal(Relation rel, HeapTuple tup,
 
 	aclresult = pg_type_aclcheck(sourcetypeid, GetUserId(), ACL_USAGE);
 	if (aclresult != ACLCHECK_OK)
-		aclcheck_error(aclresult, ACL_KIND_TYPE,
-	   format_type_be(sourcetypeid));
+		aclcheck_error_type(aclresult, sourcetypeid);
 
 	aclresult = pg_type_aclcheck(targettypeid, GetUserId(), ACL_USAGE);
 	if (aclresult != ACLCHECK_OK)
-		aclcheck_error(aclresult, ACL_KIND_TYPE,
-	   format_type_be(targettypeid));
+		aclcheck_error_type(aclresult, targettypeid);

Re: [HACKERS] temporal support patch

2012-05-20 Thread Pavel Stehule
Hello

2012/5/18 Miroslav Šimulčík simulcik.m...@gmail.com:
 Hello.

 The SQL 2011 standard wasn't available at the time I started this project, so
 I built my implementation on the older proposals TSQL2 and SQL/Temporal, which
 were the only ones available. Neither of these was accepted by the ANSI/ISO
 committees, however.

 SQL 2011 uses a different syntax, which looks like the one IBM DB2 had been
 using even before the standard was published.

 So my implementation differs in syntax, but the features are the same as those
 described in the system-versioned tables part of the slideshow.


I would like to see temporal functionality in pg, but only with the SQL 2011
syntax. Using syntax from deprecated proposals makes no sense. I am also not
sure the history table concept is the best from a performance point of view -
it is simpler to implement, but you duplicate all indexes and there will be a
lot of redundant fields in the history table. An important question is the
difference in cost between some non-trivial query on current data and the same
query on historic data.

Regards

Pavel Stehule

 Regards
 Miroslav Simulcik


 2012/5/17 Pavel Stehule pavel.steh...@gmail.com

 Hello

 what is conformance of your solution with temporal extension in ANSI SQL
 2011


 http://www.slideshare.net/CraigBaumunk/temporal-extensions-tosql20112012010438

 Regards

 Pavel Stehule

 2012/5/16 Miroslav Šimulčík simulcik.m...@gmail.com:
  Hi all,
 
  as a part of my master's thesis I have created a temporal support patch for
  PostgreSQL. It enables the creation of special temporal tables with entry
  versioning. Modifying operations (UPDATE, DELETE, TRUNCATE) on these tables
  don't cause permanent changes to entries, but create new versions of them.
  Thus the user can easily get to past states of the table.
 
  Basic information on temporal databases can be found
  on http://en.wikipedia.org/wiki/Temporal_database
 
  In the field of temporal databases, there are only proprietary solutions
  available. During the analysis I found these:
      - IBM DB2 10 for z/OS
      - Oracle 11g Workspace Manager
      - Teradata Database 13.10
 
  The primary goal of my work was the creation of an open-source solution that
  is easy to use and is backward compatible with existing applications, so that
  changing the original tables into temporal ones does not require changes to
  the applications that work with them. This patch is built on the SQL/Temporal
  proposal with some minor modifications inspired by commercial temporal
  database systems. Currently it only deals with transaction-time support.
 
  Here is a simple description of how it works:

  1. The user can create a transaction-time table using a modified CREATE
  TABLE command:

      CREATE TABLE person(name varchar(50)) AS TRANSACTIONTIME;

      This command automatically creates all objects required for
      transaction-time support:
 
                          List of relations
     Schema |         Name         |   Type   |  Owner
    --------+----------------------+----------+----------
     public | person               | table    | tester
     public | person__entry_id_seq | sequence | tester
     public | person_hist          | table    | postgres


                                     Table "public.person"
      Column    |            Type             |                          Modifiers
    ------------+-----------------------------+------------------------------------------------------------
     name       | character varying(50)       |
     _entry_id  | bigint                      | not null default nextval('person__entry_id_seq'::regclass)
     _sys_start | timestamp without time zone | not null default clock_timestamp()
     _sys_end   | timestamp without time zone | not null default '294276-12-31 23:59:59.99'::timestamp without time zone
    Indexes:
        "person__entry_id_idx" btree (_entry_id)
        "person__sys_start__sys_end_idx" btree (_sys_start, _sys_end)


                     Table "public.person_hist"
      Column    |            Type             | Modifiers
    ------------+-----------------------------+-----------
     name       | character varying(50)       |
     _entry_id  | bigint                      | not null
     _sys_start | timestamp without time zone | not null
     _sys_end   | timestamp without time zone | not null
    Indexes:
        "person_hist__entry_id_idx" btree (_entry_id)
        "person_hist__sys_start__sys_end_idx" btree (_sys_start, _sys_end)
 
 
 
 
      Table person stores the current versions of entries. Three additional
      columns are added:
          _entry_id - id of the entry; it groups together different versions
          of the entry.
          _sys_start - beginning of the version validity period (version
          creation timestamp).
          _sys_end - end of the version validity period.
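
      For illustration (this is a plain query against the generated tables,
      not the patch's own syntax), the state of person as of a past instant
      could be reconstructed roughly like this:

          SELECT name
            FROM (SELECT * FROM person
                  UNION ALL
                  SELECT * FROM person_hist) v
           WHERE v._sys_start <= TIMESTAMP '2012-05-01 12:00:00'
             AND v._sys_end   >  TIMESTAMP '2012-05-01 12:00:00';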
 
      Table person_hist stores historical versions of entries. It has the
      same structure and indexes as the person table, but without any
      constraints and
  

Re: [HACKERS] weird error message in sepgsql

2012-05-20 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 I found this in contrib/sepgsql/expected/label.out:
 SECURITY LABEL ON COLUMN t2
 IS 'system_u:object_r:sepgsql_ro_table_t:s0';   -- be failed
 ERROR:  improper relation name (too many dotted names): nothing

 Contrast with:

 SECURITY LABEL ON COLUMN t2.b
 IS 'system_u:object_r:sepgsql_ro_table_t:s0';   -- ok

 I guess what's happening is that it's calling makeRangeVarFromNameList()
 with a list of length zero.

 We should either fix the SECURITY LABEL command to catch that case
 beforehand, or fix makeRangeVarFromNameList() to give a proper error
 message, or both.

I think the appropriate error message is probably along the lines of
"column name must be qualified", and it's hard to justify having
makeRangeVarFromNameList emit such a thing.  So this is the fault
of the calling code.

regards, tom lane



Re: [HACKERS] Remove readline notice from psql --version?

2012-05-20 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Currently, psql --version prints something like
 psql (PostgreSQL) 9.2beta1
 contains support for command-line editing

 I think the notice about readline is a leftover from the old days when
 psql was often built without any readline support.  Nowadays this looks
 like an anomaly, and it doesn't actually contain any information that
 would be useful, such as which particular library is used or what kind
 of history support is active.  I suggest we remove this second line, or
 alternatively we could add more useful detail, but what?

Hm, I had actually forgotten that was there.  When the question of
"which readline library does this use?" has come up in the past,
we've always been able to get an answer from ldd or local equivalent.
I suppose that could fail in the case of statically linked libraries,
but the practical use for the message seems about nil.  I'm good with
just removing it.

regards, tom lane



Re: [HACKERS] read() returns ERANGE in Mac OS X

2012-05-20 Thread Alvaro Herrera

Excerpts from Florian Pflug's message of sáb may 19 03:48:51 -0400 2012:
 
 On May18, 2012, at 23:18 , Alvaro Herrera wrote:
  Excerpts from Florian Pflug's message of jue may 17 09:08:26 -0400 2012:

  Seems to me that we could make zero_damaged_pages an enum.  The default
  value of "on" would only catch truncated-away pages; another value would
  also capture kernel-level error conditions.
 
 Yeah, an enum would be nicer than an additional GUC. I kinda keep forgetting
 that we have those. Though to bikeshed, the GUC should probably be just called
 'zero_pages' and take the values 'never', 'missing', 'unreadable' ;-)

Sounds reasonable to me ..
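
For concreteness, usage of such a setting might look like this (purely
hypothetical; the name and values are just the ones suggested above):

    SET zero_pages = 'never';       -- never zero pages; keep erroring out as today
    SET zero_pages = 'missing';     -- zero only truncated-away pages
    SET zero_pages = 'unreadable';  -- also zero pages the kernel reports read errors for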

  The thing is, once you start getting kernel-level errors you're pretty
  much screwed and there's no way to just recover whatever data is
  recoverable.
 
 I thought your initial gripe was precisely that you got a kernel-level error,
 yet the filesystem was still in pretty good shape?

Uhm.  I'm not really sure what the actual problem is, but I think it is
precisely a corrupted filesystem.

 Which actually seemed quite likely to me - the cause could be, for example,
 simply a single bad block. Or a filesystem-level checksum error if you're 
 using
 a filesystem with built-in integrity checks.

I guess ERANGE is the sort of thing that's not quite expected here -- I
mean you might get EIO if there's an I/O problem such as a checksum
error, but ERANGE suggests to me that the kernel might be leaking some
internal error that's not supposed to be thrown to the user.

In any case I don't think we can distinguish kernel-level problems such
as this one, from filesystem level problems.  I mean, they all come from
the kernel, as far as we're concerned.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Why is indexonlyscan so darned slow?

2012-05-20 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 On Thu, May 17, 2012 at 11:35 AM, Joshua Berkus j...@agliodbs.com wrote:
 That's in-RAM speed ... I ran the query twice to make sure the index was 
 cached, and it didn't get any better.  And I meant 5X per byte rather than 
 5X per tuple.

 Ah, OK that makes more sense.  I played around with this, specifically
 count(*), quite a bit when IOS first came out, and I attributed a
 large part of the time to the code that forms a tuple out of raw
 bytes, and the code that advances the aggregate.  The first one is
 probably more a per-tuple cost than per byte, and the second
 definitely is per tuple cost.
 I can't find my detailed notes from this work, so this is just from memory.

I did a quick investigation of this example with oprofile, and found
that there's not going to be any easy large improvement available.
It's basically all per-tuple CPU costs, breaking down like this:

samples  %        symbol name
1551313.4664  IndexOnlyNext
10886 9.4498  index_getnext_tid
7526  6.5331  visibilitymap_test
7116  6.1772  ExecClearTuple
7054  6.1234  _bt_checkkeys
6804  5.9064  _bt_next
6344  5.5070  ExecProject
6033  5.2371  advance_aggregates
5619  4.8777  ExecScan
5331  4.6277  advance_transition_function
5202  4.5157  btgettuple
4928  4.2779  _bt_saveitem
4653  4.0391  ExecProcNode
4473  3.8829  int8inc
3404  2.9549  MemoryContextReset
3125  2.7127  _bt_readpage
2768  2.4028  FunctionCall2Coll
2278  1.9775  ExecAgg
1502  1.3038  ExecStoreVirtualTuple
1198  1.0399  BufferGetBlockNumber
1105  0.9592  ExecIndexOnlyScan
946   0.8212  hash_search_with_hash_value

A fair chunk of what's being attributed to IndexOnlyNext is actually the
inlined code for StoreIndexTuple, and that in turn is mostly the inlined
code for index_getattr.  We might possibly save a bit here if we noticed
that the query doesn't actually need us to fetch the indexed columns,
but it looks like that would only buy a couple percent overall --- and
testing for that would add useless cycles to every case where we *do*
need the indexed value.  So I'm afraid that it might amount to optimizing
SELECT COUNT(*) at the expense of everything else, which I'm not for.

Another possibility is to try to reduce the costs of index_getnext_tid
and FunctionCall2Coll, which are basically just trampolines to reach
btgettuple.  It's not immediately obvious how to make that much better
though.

Anyway, on my machine it seems that the per-tuple CPU costs for SELECT
COUNT(*) with an index-only scan are something like 10% higher than the
per-tuple costs with a heap scan.  We might get that down to roughly par
with some hacking, but it's never going to be vastly better.  The
argument in favor of index-only scans is mainly about reducing I/O costs
anyway.
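
For reference, a profile of this general shape can be reproduced with a setup
along these lines (table name and size are illustrative, not the case that was
actually profiled):

    CREATE TABLE t AS SELECT g AS id FROM generate_series(1, 10000000) g;
    CREATE INDEX t_id_idx ON t (id);
    VACUUM ANALYZE t;   -- set the visibility map so the index-only scan can skip heap visits

    EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM t;   -- expect an Index Only Scan
    SET enable_indexonlyscan = off;
    EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM t;   -- compare with the heap scan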

regards, tom lane



Re: [HACKERS] Remove readline notice from psql --version?

2012-05-20 Thread David Fetter
On Sun, May 20, 2012 at 01:24:21AM +0300, Peter Eisentraut wrote:
 Currently, psql --version prints something like
 
 psql (PostgreSQL) 9.2beta1
 contains support for command-line editing

I think this should be replaced with a notice about the actual library
used.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



[HACKERS] Schema version management

2012-05-20 Thread Joel Jacobson
Hi,

I just read a very interesting post about schema version management.

Quote: "You could set it up so that every developer gets their own
test database, sets up the schema there, takes a dump, and checks that
in. There are going to be problems with that, including that dumps
produced by pg_dump are ugly and optimized for restoring, not for
developing with, and they don't have a deterministic output order." (
http://petereisentraut.blogspot.com/2012/05/my-anti-take-on-database-schema-version.html
)

Back in December 2010, I suggested a new option to pg_dump, --split,
which would write the schema definition of each object in separate
files:

http://archives.postgresql.org/pgsql-hackers/2010-12/msg02285.php

Instead of one huge plain-text schema file, which is impractical to version
control, all tables/sequences/views/functions are written to separate
files, allowing the use of a version control system, such as git, to do
proper version control.

The deterministic output order problem mentioned in the post above
is not a problem if each object (table/sequence/view/function/etc.) is
written to the same filename every time.
No matter the order in which the objects are dumped, the tree of files
and their contents will be identical.

I remember a lot of hackers were very positive about this option, but
we somehow failed to agree on the naming of files in the tree
structure. I'm sure we can work that out though.

I use this feature in production: I have a cronjob which dumps the
schema every hour, committing any changes to a separate git branch for
each database installation, such as production, development and test.
If no changes to the schema have been made, nothing is committed to git,
since none of the files have changed.

It is then drop-dead simple to diff two different branches of the
database schema, such as development and production, or to diff
different revisions, allowing point-in-time comparison of the schema.

This is an example of the output of git log --summary for one of the
automatic commits to our production database's git repo:

--
commit 18c31f8162d851b0dac3bad7e80529ef2ed18be3
Author: Production Database production.datab...@trustly.com
Date:   Fri May 4 15:00:04 2012 +0200

    Update of database schema Linux DB0 2.6.26-2-amd64 #1 SMP Wed Aug
19 22:33:18 UTC 2009 x86_64 GNU/Linux Fri, 04 May 2012 15:00:04 +0200

 create mode 100644
gluepay-split/public/CONSTRAINT/openingclosingbalances_pkey.sql
 create mode 100644
gluepay-split/public/CONSTRAINT/openingclosingbalances_source_key.sql
 create mode 100644 gluepay-split/public/SEQUENCE/seqopeningclosingbalance.sql
 create mode 100644 gluepay-split/public/TABLE/openingclosingbalances.sql
--

Here we can see that we apparently deployed a new table,
openingclosingbalances, around Fri May 4 15:00:04.

Without any manual work, I'm able to follow all changes actually
_deployed_ in each database.

At my company, a highly database-centric stored-procedure intensive
business dealing with mission-critical monetary transactions, we've
been using this technique to successfully do schema version management
without any hassle for the last two years.

Hopefully this can add to the list of various possible _useful_ schema
version management methods.

Best regards,

Joel Jacobson



Re: [HACKERS] Why is indexonlyscan so darned slow?

2012-05-20 Thread Robert Haas
On Sun, May 20, 2012 at 3:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Another possibility is to try to reduce the costs of index_getnext_tid
 and FunctionCall2Coll, which are basically just trampolines to reach
 btgettuple.  It's not immediately obvious how to make that much better
 though.

Hmm... this seems awfully similar to the problem we tried to solve
with the sortsupport infrastructure.  Maybe something similar would be
warranted here, to save the overhead of repeated argument packing and
unpacking.

Here's some 'perf' results from the IBM POWER7 box:

 10.01% postgres  postgres   [.] visibilitymap_test
  8.78% postgres  postgres   [.] IndexOnlyNext
  7.85% postgres  postgres   [.] btgettuple
  5.67% postgres  postgres   [.] ExecProject
  5.56% postgres  postgres   [.] ExecProcNode
  5.51% postgres  postgres   [.] advance_transition_function
  5.06% postgres  postgres   [.] advance_aggregates
  5.02% postgres  postgres   [.] ExecScan
  4.43% postgres  postgres   [.] FunctionCall2Coll
  4.11% postgres  postgres   [.] _bt_checkkeys
  3.54% postgres  postgres   [.] ExecClearTuple
  3.42% postgres  postgres   [.] int8inc
  3.25% postgres  postgres   [.] _bt_next
  3.19% postgres  postgres   [.] MemoryContextReset
  2.95% postgres  postgres   [.] index_getnext_tid
  2.81% postgres  postgres   [.] _bt_readpage
  2.43% postgres  postgres   [.] _bt_saveitem
  2.42% postgres  postgres   [.] ExecIndexOnlyScan
  2.32% postgres  libc-2.14.90.so[.] memcpy

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Schema version management

2012-05-20 Thread Daniel Farina
On Sun, May 20, 2012 at 12:41 PM, Joel Jacobson j...@trustly.com wrote:
 Hi,

 I just read a very interesting post about schema version management.

 Quote: "You could set it up so that every developer gets their own
 test database, sets up the schema there, takes a dump, and checks that
 in. There are going to be problems with that, including that dumps
 produced by pg_dump are ugly and optimized for restoring, not for
 developing with, and they don't have a deterministic output order." (
 http://petereisentraut.blogspot.com/2012/05/my-anti-take-on-database-schema-version.html
 )

I think you are absolutely right, but I'm not sure if teaching pg_dump
a new option is the best idea.  It's a pretty complex program as-is.
I've also heard some people who really wish pg knew how to self-dump
for valid reasons.

It sounds like some of the catalog wrangling and cycle-breaking
properties of pg_dump could benefit from being exposed stand-alone,
but unfortunately that's not a simple task, especially if you want to
do The Right Thing and have pg_dump link that code, given pg_dump's
criticality.

pg_extractor is a new/alternative take on the database copying
problem; maybe you could have a look at that?

-- 
fdr



Re: [HACKERS] Schema version management

2012-05-20 Thread Joel Jacobson
On Mon, May 21, 2012 at 8:08 AM, Daniel Farina dan...@heroku.com wrote:
 I think you are absolutely right, but I'm not sure if teaching pg_dump
 a new option is the best idea.  It's a pretty complex program as-is.
 I've also heard some people who really wish pg knew how to self-dump
 for valid reasons.

Complex program? Yes, pg_dump is extremely complex; I wouldn't want
to touch any of the code. A rewrite is probably close to impossible.

Complex patch? No. It's 102 lines of code and doesn't change any of
the existing code in pg_dump, it simply adds some lines writing out
the objects to separate files. Have a look at the patch, it's super
simple.

 It sounds like some of the catalog wrangling and cycle-breaking
 properties of pg_dump could benefit from being exposed stand-alone,
 but unfortunately that's not a simple task, especially if you want to
 do The Right Thing and have pg_dump link that code, given pg_dump's
 criticality.

I agree it's not a simple task, and it's probably not something anyone
will fix in the near future.
The --split option doesn't aim to solve this problem either. That's a
different problem, and it's not a problem I have.

 pg_extractor is a new/alternative take on the database copying
 problem, maybe you could have a look at that?

It's just sad to realize people need to come up with hacks and
workarounds to solve an obvious real-life problem, easily fixed inside
pg_dump with 102 lines of drop-dead simple code, not touching any of
the logic or flow in pg_dump.

I can't even imagine how many hours coders have wasted hacking together
tools like pg_extractor just to circumvent the stupid fact that pg_dump
can't do this natively.

pg_extractor is way more complex than my suggested patch: it's 974
lines of Perl code, as opposed to 102 lines of simple code in the
patch.
pg_extractor also does a lot more than simply splitting objects
into separate files, such as executing svn commands.

The splitting of objects into separate files should clearly be the
responsibility of pg_dump.
It would allow you to easily version control the schema files yourself
with any version control system, such as svn, git, etc.

I'm sure pg_extractor does its best to achieve the objective, but even
if it does, I would never trust it for production usage; version
controlling your production schema is far too important to trust to any
tool not part of the mainline distribution of postgres. And personally
I don't have any problem - I've been using the --split option for two
years. I just feel sorry for the rest of the postgres community,
unaware of how to solve this problem, having to hack together their
own little tools, or be lucky enough to find some existing hack.



Re: [HACKERS] Remove readline notice from psql --version?

2012-05-20 Thread Alvaro Herrera

Excerpts from David Fetter's message of dom may 20 15:30:52 -0400 2012:
 On Sun, May 20, 2012 at 01:24:21AM +0300, Peter Eisentraut wrote:
  Currently, psql --version prints something like
  
  psql (PostgreSQL) 9.2beta1
  contains support for command-line editing
 
 I think this should be replaced with a notice about the actual library
 used.

That was my thought as well, but is it possible to implement it?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Remove readline notice from psql --version?

2012-05-20 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from David Fetter's message of dom may 20 15:30:52 -0400 2012:
 I think this should be replaced with a notice about the actual library
 used.

 That was my thought as well, but is it possible to implement it?

And, more to the point, would it be more reliable than checking the
results of system-specific tools such as ldd?

regards, tom lane



Re: [HACKERS] Schema version management

2012-05-20 Thread Daniel Farina
On Sun, May 20, 2012 at 7:36 PM, Joel Jacobson j...@trustly.com wrote:
 On Mon, May 21, 2012 at 8:08 AM, Daniel Farina dan...@heroku.com wrote:
 I think you are absolutely right, but I'm not sure if teaching pg_dump
 a new option is the best idea.  It's a pretty complex program as-is.
 I've also heard some people who really wish pg knew how to self-dump
 for valid reasons.

 Complex program? Yes, pg_dump is extremely complex; I wouldn't want
 to touch any of the code. A rewrite is probably close to impossible.

I wouldn't be so sure about that...

 Complex patch? No. It's 102 lines of code and doesn't change any of
 the existing code in pg_dump, it simply adds some lines writing out
 the objects to separate files. Have a look at the patch, it's super
 simple.

Ah. I did not know there was a patch already out there -- somehow I did
not get that. It helps that there is one, as it can then be audited in
its precise functionality.

 It sounds like some of the catalog wrangling and cycle-breaking
 properties of pg_dump could benefit from being exposed stand-alone,
 but unfortunately that's not a simple task, especially if you want to
 do The Right Thing and have pg_dump link that code, given pg_dump's
 criticality.

 It's just sad to realize people need to come up with hacks and
 workarounds to solve an obvious real-life problem, easily fixed inside
 pg_dump with 102 lines of drop-dead simple code, not touching any of
 the logic or flow in pg_dump.

 I can't even imagine how many hours coders have wasted hacking together
 tools like pg_extractor just to circumvent the stupid fact that pg_dump
 can't do this natively.

My next question would be how this might relate to the directory dump
format.  For example, is it an embellishment of that?  It seems at
first glance that whatever this patch does might be a cousin of that
feature.  Or is it superseded by it? The documentation is clear that
tables are given their own files, but doesn't say much about how other
schema objects are stored, so it may or may not be useful to your needs.

Also, now that I look more carefully, there was a lot of conversation
about this patch; it seems like what you are doing now is reporting
its successful use, and I did not understand that by reading the
abstract of your email.  And, beyond that, do we have a summary of the
open questions that prevented it from being committed?

 I'm sure pg_extractor does its best to achieve the objective, but even
 if it does, I would never trust it for production usage; version
 controlling your production schema is far too important to trust to any
 tool not part of the mainline distribution of postgres. And personally
 I don't have any problem - I've been using the --split option for two
 years. I just feel sorry for the rest of the postgres community,
 unaware of how to solve this problem, having to hack together their
 own little tools, or be lucky enough to find some existing hack.

My thinking is that confidence would be increased if there was a piece
of code that handled a lot of the catalog munging et al that is part
of pg_dump that *is* maintained by postgres so other projects can more
convincingly add a correct veneer.

As a meta-comment, all I did was ask some polite questions.  You could
have politely disqualified pg_extractor and spared some of the
language without having gotten anything less done.

-- 
fdr



Re: [HACKERS] Remove readline notice from psql --version?

2012-05-20 Thread Alvaro Herrera

Excerpts from Tom Lane's message of dom may 20 23:04:59 -0400 2012:
 
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Excerpts from David Fetter's message of dom may 20 15:30:52 -0400 2012:
  I think this should be replaced with a notice about the actual library
  used.
 
  That was my thought as well, but is it possible to implement it?
 
 And, more to the point, would it be more reliable than checking the
 results of system-specific tools such as ldd?

If well implemented, my guess is that it would be.  For example recall
that in Debian they are shipping psql linked to libedit due to licensing
concerns, but then suggest a hack to use LD_PRELOAD to load libreadline
instead.  So ldd might tell you that it's linked to libedit and fail to
notice that at runtime something different is being used.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Remove readline notice from psql --version?

2012-05-20 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Tom Lane's message of dom may 20 23:04:59 -0400 2012:
 And, more to the point, would it be more reliable than checking the
 results of system-specific tools such as ldd?

 If well implemented, my guess is that it would be.  For example recall
 that in Debian they are shipping psql linked to libedit due to licensing
 concerns, but then suggest a hack to use LD_PRELOAD to load libreadline
 instead.  So ldd might tell you that it's linked to libedit and fail to
 notice that at runtime something different is being used.

[ raised eyebrow ... ]  Yeah, but exactly what would it take to produce
an output that told the truth in such a situation?  I'll bet a large
amount of money that you would need a separate implementation for every
platform.  And this problem just is not worth that.

regards, tom lane



Re: [HACKERS] Schema version management

2012-05-20 Thread Joel Jacobson
On Mon, May 21, 2012 at 10:06 AM, Daniel Farina dan...@heroku.com wrote:
 Also, now that I look more carefully, there was a lot of conversation
 about this patch; it seems like what you are doing now is reporting
 its successful use, and I did not understand that by reading the
 abstract of your email.  And, beyond that, do we have a summary of the
 open questions that prevented it from being committed?

Good idea. Here is an attempt at a summary:

http://archives.postgresql.org/pgsql-hackers/2010-12/msg02301.php
The initial feedback was on the usage of OIDs as file names.
This was indeed a bad idea and was changed, see
http://archives.postgresql.org/pgsql-hackers/2010-12/msg02314.php
Tom Lane also pointed out that it doesn't solve the problem of randomly
different ordering of rows within a table.
The rows within a table are not part of the schema, though; the patch
doesn't attempt to solve that problem.

http://archives.postgresql.org/pgsql-hackers/2010-12/msg02318.php
Gurjeet Singh pointed out the problem with functions sharing the same
name but having different arguments.
As of now, it's not certain they will always be dumped into the same files.
This is a valid point, and needs to be solved in an elegant way.
The arguments need to be made part of the path somehow.

http://archives.postgresql.org/pgsql-hackers/2010-12/msg02327.php
Another idea Gurjeet had was to hash the object identifier and use
that in the file's name.
Not a bad idea; it would look nicer when functions have very many arguments.
Perhaps /[schema]/[object type]/[name]/[hash of arguments].sql

http://archives.postgresql.org/pgsql-hackers/2010-12/msg02329.php
David Wilson suggested placing all overloaded functions within the same file.
Then, assuming you order them deterministically within that file, we sidestep
the file naming issue and maintain useful diff capabilities, since a diff of
the function's file will show additions or removals of the various overloaded
versions.
This would be a good idea, but it falls down on pg_dump not outputting the
functions in a deterministic order.
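
For illustration, such a per-function file would simply hold every overload in
a deterministic order (the function here is made up):

    CREATE FUNCTION add(integer, integer) RETURNS integer
        LANGUAGE sql AS $$ SELECT $1 + $2 $$;

    CREATE FUNCTION add(numeric, numeric) RETURNS numeric
        LANGUAGE sql AS $$ SELECT $1 + $2 $$;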

http://archives.postgresql.org/pgsql-hackers/2010-12/msg02496.php
Robert Treat: "I've both enjoyed reading this thread and seeing this wheel
reinvented yet again, and wholeheartedly +1 the idea of building this
directly into pg_dump. (The only thing better would be to make everything
sql callable, but that's a problem for another day.)"

 My thinking is that confidence would be increased if there was a piece
 of code that handled a lot of the catalog munging et al that is part
 of pg_dump that *is* maintained by postgres so other projects can more
 convincingly add a correct veneer.

I totally agree; the easiest place to handle it is within pg_dump.

 As a meta-comment, all I did was ask some polite questions.  You could
 have politely disqualified pg_extractor and spared some of the
 language without having gotten anything less done.

I very much appreciated your answer, and I did in no way mean to be impolite.

Best regards,

Joel



Re: [HACKERS] External Open Standards

2012-05-20 Thread Brendan Jurd
On 20 May 2012 01:52, Daniel Farina dan...@heroku.com wrote:
 The documentation is misleading on the point of our support for strict
 ISO 8601 parsing.

 http://archives.postgresql.org/pgsql-hackers/2012-02/msg01237.php

 A very fine point, but I discovered it not out of curiosity, but via a
 fairly angry user on Twitter.

 We can define the problem away since the space-inclusive format is so
 common -- so much so that it is codified in RFC 3339
 (http://www.ietf.org/rfc/rfc3339.txt).  The only problem, then, is the
 DATESTYLE ISO labeling: changing that would be really painful, so
 perhaps another solution is to parse the 'T' demanded by 8601,
 presuming no other details come to light.

We may be wandering a bit off-topic from Simon's OP, but I'll bite.
We already do *parse* the 'T' in datetime input:

postgres=# select timestamp '2012-05-21T15:05';
      timestamp
---------------------
 2012-05-21 15:05:00
(1 row)

What we don't do is *output* the 'T', but this is pretty easy to work
around, e.g., to_char(now(), 'YYYY-MM-DD"T"HH24:MI:SS').  The scope of
actually wanting the 'T' is surely pretty minor?
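
Spelled out, with the output it produces (the timestamp value shown is just
illustrative):

    SELECT to_char(now(), 'YYYY-MM-DD"T"HH24:MI:SS');
           to_char
    ---------------------
     2012-05-21T15:05:00
    (1 row)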

I'd be okay with just adding a note in the manual under "Date/Time
Output" to the effect of "Note: ISO 8601 specifies the use of uppercase
letter 'T' to separate the date and time. Postgres uses a space for
improved readability, in line with other database systems and RFC
3339."

Cheers,
BJ



Re: [HACKERS] External Open Standards

2012-05-20 Thread Daniel Farina
On Sun, May 20, 2012 at 10:34 PM, Brendan Jurd dire...@gmail.com wrote:
 What we don't do is *output* the 'T', but this is pretty easy to work
 around, e.g., to_char(now(), 'YYYY-MM-DD"T"HH24:MI:SS').  The scope of
 actually wanting the 'T' is surely pretty minor?

 I'd be okay with just adding a note in the manual under "Date/Time
 Output" to the effect of "Note: ISO 8601 specifies the use of uppercase
 letter 'T' to separate the date and time. Postgres uses a space for
 improved readability, in line with other database systems and RFC
 3339."

Yes, I believe you are quite right.  My mistake, and I like your
suggestion, too, as it has successfully informed me of the most
accurate rendition of this.

-- 
fdr
