Re: [HACKERS] 9.2.3 crashes during archive recovery

2013-02-16 Thread Ants Aasma
On Fri, Feb 15, 2013 at 3:49 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 While this solution would help solve my issue, it assumes that the
 correct amount of WAL files are actually there. Currently the docs for
 setting up a standby refer to 24.3.4. Recovering Using a Continuous
 Archive Backup, and that step recommends emptying the contents of
 pg_xlog. If this is chosen as the solution the docs should be adjusted
 to recommend using pg_basebackup -x for setting up the standby.

 When the backup is taken using pg_start_backup or pg_basebackup,
 minRecoveryPoint is set correctly anyway, and it's OK to clear out pg_xlog.

How is minRecoveryPoint supposed to get set? I just tried taking a
pg_basebackup on master running pgbench. The resulting data dir
controlfile had this:

Min recovery ending location: 0/0

The end location was not in the backup_label either.

Looking at basebackup.c the process seems to be:
1. pg_start_backup
2. copy out backup_label
3. copy out rest of the datadir
4. copy out control file
5. pg_stop_backup
6. copy out WAL
7. send backup stop location

And pg_basebackup.c only uses the stop location to decide how much WAL to fetch.

I don't see anything here that could correctly communicate min
recovery point. Maybe I'm missing something.

 Yeah, it probably could use some editing, as the underlying code has evolved
 a lot since it was written. The suggestion to clear out pg_xlog seems like
 an unnecessary complication. It's safe to do so, if you restore with an
 archive, but unnecessary.

 The File System Level Backup chapter
 (http://www.postgresql.org/docs/devel/static/backup-file.html) probably
 should mention pg_basebackup -x, too.

 Docs patches are welcome..

I will give it a shot.

Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de




Re: PATCH: Split stats file per database WAS: [HACKERS] autovacuum stress-testing our system

2013-02-16 Thread Tomas Vondra
On 15.2.2013 01:02, Tomas Vondra wrote:
 On 14.2.2013 22:24, Alvaro Herrera wrote:
 Alvaro Herrera wrote:
 Here's a ninth version of this patch.  (version 8 went unpublished).  I
 have simplified a lot of things and improved some comments; I think I
 understand much of it now.  I think this patch is fairly close to
 committable, but one issue remains, which is this bit in
 pgstat_write_statsfiles():

 I've marked this as Waiting on author for the time being.  I'm going to
 review/work on other patches now, hoping that Tomas will post an updated
 version in time for it to be considered for 9.3.
 
 Sadly I have no idea how to fix that, and I think the solution you
 suggested in the previous messages does not actually do the trick :-(


I've been thinking about this (actually I had a really weird dream about
it this night) and I think it might work like this:

(1) check the timestamp of the global file - if it's too old, we need
to send an inquiry or wait a bit longer

(2) if it's new enough, we need to read it and look for that particular
database - if it's not found, we have no info about it yet (this is
the case handled by the dummy files)

(3) if there's a database stat entry, we need to check the timestamp
when it was written for the last time - if it's too old, send an
inquiry and wait a bit longer

(4) well, we have a recent global file, it contains the database stat
entry and it's fresh enough - tadaa, we're done

At least that's my idea - I haven't tried to implement it yet.
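
Roughly, the backend-side wait loop might look something like this (just a
sketch - the helper names and constants are made up, this is not actual
pgstat.c code):

#include <stdbool.h>

typedef long long TimestampTz;
typedef unsigned int Oid;

#define STATS_RETRY_DELAY_MS	10
#define STATS_MAX_RETRIES	1000

/* hypothetical helpers; both return 0 if the file/entry does not exist */
extern TimestampTz global_file_timestamp(void);
extern TimestampTz db_entry_timestamp(Oid databaseid);
extern void send_inquiry(Oid databaseid);
extern void sleep_ms(int ms);

static bool
wait_for_fresh_db_stats(Oid databaseid, TimestampTz min_ts)
{
	int			tries;

	for (tries = 0; tries < STATS_MAX_RETRIES; tries++)
	{
		TimestampTz db_ts;

		/* (1) global file missing or too old -> inquire and wait */
		if (global_file_timestamp() < min_ts)
		{
			send_inquiry(databaseid);
			sleep_ms(STATS_RETRY_DELAY_MS);
			continue;
		}

		db_ts = db_entry_timestamp(databaseid);

		/* (2) fresh global file but no entry for this database: nothing
		 * has been reported for it yet, so there is nothing to wait for
		 * (the case the dummy files used to handle) */
		if (db_ts == 0)
			return true;

		/* (3) entry exists but is stale -> inquire and wait a bit more */
		if (db_ts < min_ts)
		{
			send_inquiry(databaseid);
			sleep_ms(STATS_RETRY_DELAY_MS);
			continue;
		}

		/* (4) recent global file with a fresh entry for our database */
		return true;
	}

	return false;				/* give up, use whatever is on disk */
}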


I see a few pros and cons of this approach:

pros:

  * no dummy files
  * no timestamps in the per-db files (and thus no sync issues)

cons:

  * the backends / workers will have to re-read the global file just to
check that the per-db file is there and is fresh enough


So far it was sufficient just to peek at the timestamp at the beginning
of the per-db stat file - minimum data read, no CPU-expensive processing
etc. Sadly, the more DBs there are, the larger the file gets (thus more
overhead to read it).

OTOH it's not that much data (~180B per entry, so with 1000 dbs
it's just ~180kB), so I don't expect this to be a tremendous issue. And
the pros seem to be quite compelling.

Tomas





[HACKERS] [RFC] indirect toast tuple support

2013-02-16 Thread Andres Freund
Hi,

During logical decoding toast tuples are decoded separately from the
main tuple. Works nicely. To make the main table's HeapTuple actually
easily usable, the tuple needs to be reconstructed so that it no longer
points to disk but to the separately reconstructed tuples.

There are two ways to do this:
a) build a new HeapTuple that contains all formerly toasted datums
   inline (i.e. !VARATT_IS_EXTERNAL)
b) add support for external toast tuples that point into separately
   allocated memory instead of a toast table

a) has the problem that the flattened HeapTuple can be bigger than
our maximal allocation size, which seems like an awkward restriction...

Given that there have been wishes to support something like b) for quite
some time, independent of logical decoding, it seems like a good idea
to add support for it. It's useful, e.g., for avoiding repeated detoasting
or decompression of tuples.

The problem with b) is that there is no space in varlena's flag bits to
directly denote that a varlena points into memory instead of either
directly containing the data or a varattrib_1b_e containing a
varatt_external pointing to an on-disk toasted tuple.

I propose extending the EXTERNAL varlenas to be able to point to memory
instead of just to disk. It seems apt to use EXTERNAL for this, as such
datums aren't stored in the normal heap tuple but somewhere else.
Unfortunately there is no backward-compatible flag space in
varattrib_1b_e either to nicely denote this and we sure don't want to
break on-disk compatibility for this. Thus I propose to distinguish
on-disk and in-memory tuples via the varattrib_1b_e.va_len_1be.

The attached (RFC, not fully ready!) patch adds the following stuff to
the public interfaces:
typedef struct varatt_indirect
{
struct varlena *pointer;/* Pointer to in-memory varlena */
} varatt_indirect;

...

#define VARATT_IS_EXTERNAL(PTR) VARATT_IS_1B_E(PTR)
#define VARATT_IS_EXTERNAL_TOAST(PTR) \
	(VARATT_IS_EXTERNAL(PTR) && VARSIZE_EXTERNAL(PTR) == TOAST_POINTER_SIZE)
#define VARATT_IS_EXTERNAL_INDIRECT(PTR) \
	(VARATT_IS_EXTERNAL(PTR) && VARSIZE_EXTERNAL(PTR) == INDIRECT_POINTER_SIZE)

I don't like making the distinction through the size, but I don't have a
better idea. And hey, it's toast/varlena stuff, who expects cleanliness ;)

Existing code doesn't need to care whether an EXTERNAL datum is TOAST or
INDIRECT; that's handled transparently in tuptoaster.c. All EXTERNAL
tuples need to go through there anyway, so that seems fine.
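
To illustrate, code that wants the in-memory varlena behind an indirect
pointer just needs a small loop like the following (sketch only, using the
definitions above; it mirrors what the tuptoaster.c hunks in the attached
patch do):

static struct varlena *
resolve_indirect_datum(struct varlena *attr)
{
	while (VARATT_IS_EXTERNAL_INDIRECT(attr))
	{
		struct varatt_indirect redirect;

		VARATT_EXTERNAL_GET_POINTER(redirect, attr);
		attr = redirect.pointer;
	}

	/* plain, compressed, or on-disk TOAST datum from here on */
	return attr;
}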

Currently toast_fetch_datum() in tuptoaster.c does part of the gruntwork
for this as it was the easiest location but I think it might be better
to spread the work to some more callsites of it for clarity's sake.

Opinions?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index 49f1553..613040a 100644
--- a/src/backend/access/heap/tuptoaster.c
+++ b/src/backend/access/heap/tuptoaster.c
@@ -44,9 +44,6 @@
 
 #undef TOAST_DEBUG
 
-/* Size of an EXTERNAL datum that contains a standard TOAST pointer */
-#define TOAST_POINTER_SIZE (VARHDRSZ_EXTERNAL + sizeof(struct varatt_external))
-
 /*
  * Testing whether an externally-stored value is compressed now requires
  * comparing extsize (the actual length of the external data) to rawsize
@@ -191,7 +188,7 @@ heap_tuple_untoast_attr_slice(struct varlena * attr,
 	char	   *attrdata;
 	int32		attrsize;
 
-	if (VARATT_IS_EXTERNAL(attr))
+	if (VARATT_IS_EXTERNAL_TOAST(attr))
 	{
 		struct varatt_external toast_pointer;
 
@@ -204,6 +201,13 @@ heap_tuple_untoast_attr_slice(struct varlena * attr,
 		/* fetch it back (compressed marker will get set automatically) */
 		preslice = toast_fetch_datum(attr);
 	}
+	else if (VARATT_IS_EXTERNAL_INDIRECT(attr))
+	{
+		struct varatt_indirect redirect;
+		VARATT_EXTERNAL_GET_POINTER(redirect, attr);
+		return heap_tuple_untoast_attr_slice(redirect.pointer,
+			 sliceoffset, slicelength);
+	}
 	else
 		preslice = attr;
 
@@ -267,7 +271,7 @@ toast_raw_datum_size(Datum value)
 	struct varlena *attr = (struct varlena *) DatumGetPointer(value);
 	Size		result;
 
-	if (VARATT_IS_EXTERNAL(attr))
+	if (VARATT_IS_EXTERNAL_TOAST(attr))
 	{
 		/* va_rawsize is the size of the original datum -- including header */
 		struct varatt_external toast_pointer;
@@ -275,6 +279,13 @@ toast_raw_datum_size(Datum value)
 		VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);
 		result = toast_pointer.va_rawsize;
 	}
+	else if (VARATT_IS_EXTERNAL_INDIRECT(attr))
+	{
+		struct varatt_indirect toast_pointer;
+
+		VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);
+		return toast_raw_datum_size(PointerGetDatum(toast_pointer.pointer));
+	}
 	else if (VARATT_IS_COMPRESSED(attr))
 	{
 		/* here, va_rawsize is just the payload size */
@@ -308,7 +319,7 @@ toast_datum_size(Datum value)
 	struct 

Re: [HACKERS] JSON Function Bike Shedding

2013-02-16 Thread Andrew Dunstan


On 02/13/2013 11:36 AM, Andrew Dunstan wrote:



Therefore, I would like to propose different names:

Existing Name               Proposed Name
-------------               -------------
json_array_length()         array_length() or length() or size()
json_each()                 each_json()
json_each_as_text()         each_text()
json_get()                  get_json()
json_get_as_text()          get_text()
json_get_path()             get_json()
json_get_path_as_text()     get_text()
json_object_keys()          get_keys()
json_populate_record()      record() or row()
json_populate_recordset()   records() or rows()
json_unnest()               get_values()
json_agg()                  collect_json()

Note that I have given json_get() and json_get_path() the same names, 
as it seems to me that the former is the same as the latter, with 
only one parameter. Same for json_get_as_text() and 
json_get_path_as_text().


I will take some of this under advisement. Note that 
json_populate_record's name was taken from hstore's populate_record, 
so if we're trying to use similar names then it should possibly be 
just populate_record. Or if that's still a bit long I would accept 
to_record.



I have had a look at doing something like this with the json_get 
functions. The trouble is that the best way to do it is to have json_get 
take variadic any, but then string literals come in as unknown rather 
than as text, which makes things fairly ugly. If we force people to cast 
path elements to text then I think the cure is worse than the disease. I 
think the best we can do here is possibly to provide json_get and 
json_get_text taking either a single int or variadic text[], and 
json_get_path and json_get_path_text taking non-variadic text[].



cheers

andrew






Re: [HACKERS] [RFC] indirect toast tuple support

2013-02-16 Thread Andres Freund
On 2013-02-16 17:42:31 +0100, Andres Freund wrote:
 +/* --
 + * toast_datum_differs -
 + *
 + *  Determine whether two toasted datums are the same and don't have to be
 + *  stored again.
 + * --
 + */
 +static bool
 +toast_datum_differs(struct varlena *old_value, struct varlena *new_value)
 +{
 + Assert(VARATT_IS_EXTERNAL(old_value));
 + Assert(VARATT_IS_EXTERNAL(new_value));
 +
 + /* fast path for the common case where we have the toast oid available */
 + if (VARATT_IS_EXTERNAL_TOAST(old_value) &&
 + VARATT_IS_EXTERNAL_TOAST(new_value))
 + return memcmp((char *) old_value, (char *) new_value,
 +   VARSIZE_EXTERNAL(old_value)) == 0;
 ...
 + /* compare payload, we're fine with unaligned data */
 + return memcmp(VARDATA_ANY(old_value), VARDATA_ANY(new_value),
 +   VARSIZE_ANY_EXHDR(old_value)) == 0;
 +}

Those == need to be !=. Comes from changing the meaning of a function
last minute without an easy way to test (it just uselessly emits a new
toast tuple when nothing changed).
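
For the record, the fixed comparisons look like this (sketch of the
corrected function; the part elided above is unchanged):

static bool
toast_datum_differs(struct varlena *old_value, struct varlena *new_value)
{
	Assert(VARATT_IS_EXTERNAL(old_value));
	Assert(VARATT_IS_EXTERNAL(new_value));

	/* fast path for the common case where we have the toast oid available */
	if (VARATT_IS_EXTERNAL_TOAST(old_value) &&
		VARATT_IS_EXTERNAL_TOAST(new_value))
		return memcmp((char *) old_value, (char *) new_value,
					  VARSIZE_EXTERNAL(old_value)) != 0;
	...
	/* compare payload, we're fine with unaligned data */
	return memcmp(VARDATA_ANY(old_value), VARDATA_ANY(new_value),
				  VARSIZE_ANY_EXHDR(old_value)) != 0;
}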

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Materialized views WIP patch

2013-02-16 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 On Fri, Feb 15, 2013 at 8:01 PM, Kevin Grittner kgri...@ymail.com wrote:

 There is one odd aspect to pg_dump, but I think the way it is
 behaving is the best way to handle it, although I invite other
 opinions.  If you load from pg_dump output, it will try to
 populate materialized views which were populated on dump, and
 leave the ones which were not scannable because the contents had
 not been generated in an empty and unscannable state on restore.
 That much seems pretty obvious.  Where it gets  a little tricky is
 if mva is generated with data, and mvb is generated based on mva.
 Then mva is truncated.  Then you dump.  mvb was populated at the
 time of the dump, but its contents can't be regenerated on restore
 because mva is not scannable.  As the patch currently stands, you
 get an error on the attempt to REFRESH mvb.  I think that's a good
 thing, but I'm open to arguments to the contrary.

 Hmm, anything that means a dump-and-restore can fail seems like a bad
 thing to me.  There's nothing outrageous about that scenario.  It's
 arguable what state dump-and-restore should leave the new database in,
 but I don't see why it shouldn't work.  I predict we'll end up with
 unhappy users if we leave it like this.

Keeping in mind that mva may take hours to refresh, and mvb may
take only minutes once you have the data from mva, what behavior do
you think is preferable?

The alternatives I can think of are:

(1)  Force mva to refresh on restore, even though it was empty and
not scannable on dump.  This may delay completion of the restore
for an extended time.  It would leave both mva and mvb populated.

(2)  Populate mvb by using mva's query as a regular view.  This
would leave things in the same state as they were on dump, and
might possibly be optimized to something faster than generating mva
and then mvb; but probably would not be much faster in most cases.

(3)  Change the failure to generate data for mvb in this case to a
WARNING rather than an ERROR.

(4)  Actually dump and restore data with COPY statements for
materialized views, rather than having the dump create REFRESH
statements.  The main down side of this, it seems to me, is that it
opens up materialized views to direct tinkering of contents via SQL
statements, which I was hoping to avoid.  Perhaps this can be
mitigated in some way.

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] is it bug? - printing boolean domains in sql/xml function

2013-02-16 Thread Noah Misch
On Sun, Jan 13, 2013 at 07:54:11AM +0100, Pavel Stehule wrote:
 related to 
 http://www.postgresql.org/message-id/cafj8prdtavfnrazwet+ewmfrbdzffva8w17kk_e12fb6t-z...@mail.gmail.com
 
 a boolean domain is serialised to a string differently than a plain boolean
 
 postgres=# CREATE DOMAIN booldomain as bool;
 CREATE DOMAIN
 
 -- fully expected behavior
 postgres=# select true, true::booldomain;
  bool | booldomain
 ------+------------
  t    | t
 (1 row)
 
 postgres=# select true::text, true::booldomain::text;
  text | text
 ------+------
  true | true
 (1 row)
 
 -- unexpected behavior
 postgres=# select xmlforest(true as bool, true::booldomain as booldomain);
                   xmlforest
 ---------------------------------------------
  <bool>true</bool><booldomain>t</booldomain>
 (1 row)

 is this the expected behavior?

There is a bug here.  map_sql_type_to_xmlschema_type() has special treatment
for domains, but map_sql_value_to_xml_value() and its callers have no
corresponding logic.  In the same vein, this yields a schema that does not
validate its corresponding document:

set datestyle = 'sql, dmy';
create domain datedom as date;
create table t as select current_date AS a, current_date::datedom AS b;
select table_to_xml('t', true, true, '');
select table_to_xmlschema('t', true, true, '');

One could debate whether the schema generation or the data generation should
be the one to change, but I tentatively vote for the latter.
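
Changing the data generation would mostly mean looking through the domain
before picking the output representation.  A minimal, untested sketch of the
idea (getBaseType() is the existing helper from utils/lsyscache.h; where
exactly xml.c would call this is left to the actual patch):

#include "postgres.h"
#include "utils/lsyscache.h"		/* getBaseType() */

/*
 * Sketch only: map_sql_value_to_xml_value() could run its type argument
 * through something like this before entering the per-type special cases
 * (BOOLOID, DATEOID, ...), so that a domain over bool or date is printed
 * exactly like its base type.  getBaseType() recurses through nested
 * domains and returns the input unchanged for non-domain types.
 */
static Oid
xml_output_base_type(Oid typid)
{
	return getBaseType(typid);
}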

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Materialized views WIP patch

2013-02-16 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote:
 On Fri, Feb 15, 2013 at 08:24:16PM -0500, Robert Haas wrote:
 On Fri, Feb 15, 2013 at 8:01 PM, Kevin Grittner kgri...@ymail.com wrote:
 There is one odd aspect to pg_dump, but I think the way it is
 behaving is the best way to handle it, although I invite other
 opinions.  If you load from pg_dump output, it will try to
 populate materialized views which were populated on dump, and
 leave the ones which were not scannable because the contents had
 not been generated in an empty and unscannable state on restore.
 That much seems pretty obvious.  Where it gets  a little tricky is
 if mva is generated with data, and mvb is generated based on mva.
 Then mva is truncated.  Then you dump.  mvb was populated at the
 time of the dump, but its contents can't be regenerated on restore
 because mva is not scannable.  As the patch currently stands, you
 get an error on the attempt to REFRESH mvb.  I think that's a good
 thing, but I'm open to arguments to the contrary.

 Hmm, anything that means a dump-and-restore can fail seems like a bad
 thing to me.  There's nothing outrageous about that scenario.  It's
 arguable what state dump-and-restore should leave the new database in,
 but I don't see why it shouldn't work.  I predict we'll end up with
 unhappy users if we leave it like this.

 pg_upgrade is going to fail on that pg_restore error.  :-(

With the hard link option it should succeed, I would think.  If we
arranged for the check option, when run without the hard link
option, to report such cases so that the user could choose to
either truncate mvb or refresh mva before the upgrade, would that
satisfy this concern?

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] is it bug? - printing boolean domains in sql/xml function

2013-02-16 Thread Pavel Stehule
2013/2/16 Noah Misch n...@leadboat.com:
 On Sun, Jan 13, 2013 at 07:54:11AM +0100, Pavel Stehule wrote:
 related to 
 http://www.postgresql.org/message-id/cafj8prdtavfnrazwet+ewmfrbdzffva8w17kk_e12fb6t-z...@mail.gmail.com

 a boolean domain is serialised to a string differently than a plain boolean

 postgres=# CREATE DOMAIN booldomain as bool;
 CREATE DOMAIN

 -- fully expected behavior
 postgres=# select true, true::booldomain;
  bool | booldomain
 ------+------------
  t    | t
 (1 row)

 postgres=# select true::text, true::booldomain::text;
  text | text
 ------+------
  true | true
 (1 row)

 -- unexpected behavior
 postgres=# select xmlforest(true as bool, true::booldomain as booldomain);
                   xmlforest
 ---------------------------------------------
  <bool>true</bool><booldomain>t</booldomain>
 (1 row)

 is this the expected behavior?

 There is a bug here.  map_sql_type_to_xmlschema_type() has special treatment
 for domains, but map_sql_value_to_xml_value() and its callers have no
 corresponding logic.  In the same vein, this yields a schema that does not
 validate its corresponding document:

 set datestyle = 'sql, dmy';
 create domain datedom as date;
 create table t as select current_date AS a, current_date::datedom AS b;
 select table_to_xml('t', true, true, '');
 select table_to_xmlschema('t', true, true, '');

 One could debate whether the schema generation or the data generation should
 be the one to change, but I tentatively vote for the latter.

Yes, I think it is a bug too.

If we agree that it should be fixed, I'll write a fix.

Regards

Pavel




 Thanks,
 nm

 --
 Noah Misch
 EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Materialized views WIP patch

2013-02-16 Thread Noah Misch
On Sat, Feb 16, 2013 at 09:53:14AM -0800, Kevin Grittner wrote:
 Robert Haas robertmh...@gmail.com wrote:
  On Fri, Feb 15, 2013 at 8:01 PM, Kevin Grittner kgri...@ymail.com wrote:
  There is one odd aspect to pg_dump, but I think the way it is
  behaving is the best way to handle it, although I invite other
  opinions.  If you load from pg_dump output, it will try to
  populate materialized views which were populated on dump, and
  leave the ones which were not scannable because the contents had
  not been generated in an empty and unscannable state on restore.
  That much seems pretty obvious.  Where it gets  a little tricky is
  if mva is generated with data, and mvb is generated based on mva.
  Then mva is truncated.  Then you dump.  mvb was populated at the
  time of the dump, but its contents can't be regenerated on restore
  because mva is not scannable.  As the patch currently stands, you
  get an error on the attempt to REFRESH mvb.  I think that's a good
  thing, but I'm open to arguments to the contrary.
 
  Hmm, anything that means a dump-and-restore can fail seems like a bad
  thing to me.  There's nothing outrageous about that scenario.  It's
  arguable what state dump-and-restore should leave the new database in,
  but I don't see why it shouldn't work.  I predict we'll end up with
  unhappy users if we leave it like this.

I agree that making the dump fail on this account is bad.

 Keeping in mind that mva may take hours to refresh, and mvb may
 take only minutes once you have the data from mva, what behavior do
 you think is preferable?
 
 The alternatives I can think of are:
 
 (1)  Force mva to refresh on restore, even though it was empty and
 not scannable on dump.  This may delay completion of the restore
 for an extended time.  It would leave both mva and mvb populated.

This is reasonable.  If the user doesn't like it, he can presumably use an
edited dump list to remove the REFRESHes.

 (2)  Populate mvb by using mva's query as a regular view.  This
 would leave things in the same state as they were on dump, and
 might possibly be optimized to something faster than generating mva
 and then mvb; but probably would not be much faster in most cases.

Interesting idea, but I don't think adding novel server behavior is justified
to achieve this.

 (3)  Change the failure to generate data for mvb in this case to a
 WARNING rather than an ERROR.

This is also fair.  However, I think it's better to restore more valid MVs
(option 1) than fewer.

 (4)  Actually dump and restore data with COPY statements for
 materialized views, rather than having the dump create REFRESH
 statements.  The main down side of this, it seems to me, is that it
 opens up materialized views to direct tinkering of contents via SQL
 statements, which I was hoping to avoid.  Perhaps this can be
 mitigated in some way.

This is a door better left closed.


Overall, I recommend option 1.  None of the options will suit every database,
but the DBA can always tailor the outcome by replacing the dumped REFRESH
statements with his own.  I'm not envisioning that MVs left invalid for the
long term will be a typical thing, anyway.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] JSON Function Bike Shedding

2013-02-16 Thread David E. Wheeler
On Feb 16, 2013, at 8:57 AM, Andrew Dunstan and...@dunslane.net wrote:

 I have had a look at doing something like this with the json_get functions. 
 The trouble is that the best way to do it is to have json_get take variadic 
 any, but then string literals come in as unknown rather than as text, which 
 makes things fairly ugly. If we force people to cast path elements to text 
 then I think the cure is worse than the disease. I think the best we can do 
 here is possibly to provide json_get and json_get_text taking either a single 
 int or variadic text[], and json_get_path and json_get_path_text taking 
 non-variadic text[].

Why not also one taking a single text?

get(text)
get(int)
get(variadic text[])

?

David





Re: [HACKERS] JSON Function Bike Shedding

2013-02-16 Thread Andres Freund
On 2013-02-16 11:55:26 -0800, David E. Wheeler wrote:
 On Feb 16, 2013, at 8:57 AM, Andrew Dunstan and...@dunslane.net wrote:
 
  I have had a look at doing something like this with the json_get functions. 
  The trouble is that the best way to do it is to have json_get take 
  variadic any, but then string literals come in as unknown rather than as 
  text, which makes things fairly ugly. If we force people to cast path 
  elements to text then I think the cure is worse than the disease. I think 
  the best we can do here is possibly to provide json_get and json_get_text 
  taking either a single int or variadic text[], and json_get_path and 
  json_get_path_text taking non-variadic text[].
 
 Why not also one taking a single text?
 
 get(text)
 get(int)
 get(variadic text[])

Those aren't differentiable by their argument types. Why should json be
able to claim that namespace and not other datatypes?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] JSON Function Bike Shedding

2013-02-16 Thread Andrew Dunstan


On 02/16/2013 03:05 PM, Andres Freund wrote:

On 2013-02-16 11:55:26 -0800, David E. Wheeler wrote:

On Feb 16, 2013, at 8:57 AM, Andrew Dunstan and...@dunslane.net wrote:


I have had a look at doing something like this with the json_get functions. The trouble 
is that the best way to do it is to have json_get take variadic any, but then 
string literals come in as unknown rather than as text, which makes things fairly ugly. 
If we force people to cast path elements to text then I think the cure is worse than the 
disease. I think the best we can do here is possibly to provide json_get and 
json_get_text taking either a single int or variadic text[], and json_get_path and 
json_get_path_text taking non-variadic text[].

Why not also one taking a single text?

 get(text)
 get(int)
 get(variadic text[])

Those aren't differentiable by their argument types. Why should json be
able to claim that namespace and not other datatypes?




Well, of course the calls would be

get(json, ...)

although I'm still waiting to see if anyone else agrees with Robert 
about the naming of the functions.


To answer David's point, there is no point in having both

get(json,text)
get(json, variadic text[])


since the second can encompass the first, and having both would make 
calls ambiguous.


cheers

andrew





Re: [HACKERS] FDW for PostgreSQL

2013-02-16 Thread Tom Lane
Continuing to look at this patch ... I'm wondering if any particular
discussion went into choosing the FDW option names nspname, relname,
and colname.  These don't seem to me like names that we ought to be
exposing at the SQL command level.  Why not just schema, table,
column?  Or perhaps schema_name, table_name, column_name if you
feel it's essential to distinguish that these are names.

regards, tom lane




Re: [HACKERS] src/ports/pgcheckdir.c - Ignore dot directories...

2013-02-16 Thread Bruce Momjian
On Fri, Feb 15, 2013 at 12:12:03PM -0500, Bruce Momjian wrote:
 On Thu, Feb 14, 2013 at 07:21:27PM -0500, Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   Agreed.  The attached patch modifies pg_check_dir() to report about
   invisible and lost+found directory entries, and give more helpful
   messages to the user.
  
  I'm not terribly thrilled with special-casing 'lost+found' like that,
  since it's an extremely filesystem-dependent thing that even today
  probably only applies to a minority of our installed platforms.
  
  The special case for dotfiles might be useful, not because of any
  connection to mount points but just because someone might forget
  that such could be lurking in a directory that looks empty.
 
 I was ready to give up on this patch, but then I thought, what
 percentage of mount points do lost+found and dot-file-only directories
 cover?  What other cases are there?
 
 This updated version of the patch reports about dot files if they are
 the _only_ files in the directory, and it suggests a top-level mount
 point might be the cause.
 
 Does this help?

Applied.
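
For anyone following along, the classification logic can be sketched as a
standalone program like this (illustration only - the return values and
names are made up, this is not the actual src/port code):

#include <dirent.h>
#include <stdio.h>
#include <string.h>

typedef enum
{
	DIR_EMPTY,
	DIR_ONLY_DOT_FILES,			/* hint: possibly a top-level mount point */
	DIR_NOT_EMPTY,
	DIR_ERROR
} DirStatus;

static DirStatus
check_dir(const char *path)
{
	DIR		   *dir = opendir(path);
	struct dirent *de;
	int			dot_files = 0;
	int			other = 0;

	if (dir == NULL)
		return DIR_ERROR;

	while ((de = readdir(dir)) != NULL)
	{
		/* skip the directory self/parent entries */
		if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
			continue;
		if (de->d_name[0] == '.')
			dot_files++;
		else
			other++;
	}
	closedir(dir);

	if (other > 0)
		return DIR_NOT_EMPTY;
	if (dot_files > 0)
		return DIR_ONLY_DOT_FILES;
	return DIR_EMPTY;
}

int
main(int argc, char **argv)
{
	printf("%d\n", check_dir(argc > 1 ? argv[1] : "."));
	return 0;
}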

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] JSON Function Bike Shedding

2013-02-16 Thread David E. Wheeler
On Feb 16, 2013, at 12:47 PM, Andrew Dunstan and...@dunslane.net wrote:

 To answer David's point, there is no point in having both
 
get(json,text)
get(json, variadic text[])
 
 since the second can encompass the first, and having both would make calls 
 ambiguous.

Oh. Well then how about

   get(json, int)
   get(json, text)
   get(json, text[])

?

David



Re: [HACKERS] FDW for PostgreSQL

2013-02-16 Thread Shigeru Hanada
On Sun, Feb 17, 2013 at 8:44 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Continuing to look at this patch ... I'm wondering if any particular
 discussion went into choosing the FDW option names nspname, relname,
 and colname.

IIRC, there was no deep discussion about those option names.  I simply
chose relname and nspname from pg_class and pg_namespace.  At that
time I thought users would understand those options easily if the names
used in the catalogs were used.

 These don't seem to me like names that we ought to be
 exposing at the SQL command level.  Why not just schema, table,
 column?  Or perhaps schema_name, table_name, column_name if you
 feel it's essential to distinguish that these are names.

I think non-shortened names (the words used in documentation and
conversations) are better now.  I prefer table_name to table, because it
would be easy to distinguish as a name, even if we add new options like
table_foo.

Besides, I found a strange(?) behavior of the psql \d+ command in the
no-postfix case, though it wouldn't be a serious problem.

In the psql \d+ result for postgres_fdw foreign tables, "table" and
"column" are quoted, but schema is not.  Is this behavior of
quote_ident() intentional?

postgres=# \d+ pgbench1_branches
                      Foreign table "public.pgbench1_branches"
  Column  |     Type      | Modifiers |   FDW Options    | Storage  | Stats target | Description
----------+---------------+-----------+------------------+----------+--------------+-------------
 bid      | integer       | not null  | ("column" 'bid') | plain    |              |
 bbalance | integer       |           |                  | plain    |              |
 filler   | character(88) |           |                  | extended |              |
Server: pgbench1
FDW Options: (schema 'public', "table" 'foo')
Has OIDs: no

We can use the table and column options without quoting (or with quotes,
of course) in CREATE/ALTER FOREIGN TABLE commands, so this is not a
barrier against choosing no-postfix names.

-- 
Shigeru HANADA




Re: PATCH: Split stats file per database WAS: [HACKERS] autovacuum stress-testing our system

2013-02-16 Thread Alvaro Herrera
Tomas Vondra wrote:

 I've been thinking about this (actually I had a really weird dream about
 it this night) and I think it might work like this:
 
 (1) check the timestamp of the global file - if it's too old, we need
 to send an inquiry or wait a bit longer
 
 (2) if it's new enough, we need to read it and look for that particular
 database - if it's not found, we have no info about it yet (this is
 the case handled by the dummy files)
 
 (3) if there's a database stat entry, we need to check the timestamp
 when it was written for the last time - if it's too old, send an
 inquiry and wait a bit longer
 
 (4) well, we have a recent global file, it contains the database stat
 entry and it's fresh enough - tadaa, we're done

Hmm, yes, I think this is what I was imagining.  I had even considered
that the timestamp would be removed from the per-db file as you suggest
here.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Add PQconninfoParseParams and PQconninfodefaultsMerge to libpq

2013-02-16 Thread Amit kapila
On Tuesday, February 12, 2013 2:49 AM Heikki Linnakangas wrote:
On 04.02.2013 17:32, Alvaro Herrera wrote:
 Phil Sorber wrote:
 On Mon, Feb 4, 2013 at 10:16 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com  wrote:
 Phil Sorber wrote:
 On Mon, Feb 4, 2013 at 9:13 AM, Alvaro Herrera alvhe...@2ndquadrant.com
 wrote:


 I think this patch would simplify the patch to pass a connection string
 to pg_basebackup and pg_receivexlog:
 http://www.postgresql.org/message-id/005501cdfa45$6b0eec80$412cc580$@ko...@huawei.com.
 I note that pg_dumpall also has a similar issue as pg_basebackup and
 pg_receivexlog; there's no way to pass a connection string to it either.

I have looked into both patches; my analysis is as below:

In the pg_basebackup patch, we have a connection string and a list of
keywords (individual options specified by the user). The currently
available patch combines the connection string and the list of keywords
into a single connection string and calls PQconnectdb(), which takes a
connection string as input.

Now that Phil Sorber's patch provides the two new APIs
PQconninfoParseParams() and PQconninfodefaultsMerge(), I can think of the
following way for the "pass a connection string to pg_basebackup" patch:

1. Call the existing function PQconninfoParse() with the connection string
   input by the user and get a PQconninfoOption.

2. Take the existing keywords (individual options specified by the user),
   extract the keywords from the PQconninfoOption structure, and call the
   new API PQconninfoParseParams(), which will return a PQconninfoOption.
   The PQconninfoOption structure returned in this step will contain all
   keywords.

3. Call PQconninfodefaultsMerge() to merge in any default values, if they
   exist. Not sure if this step is required?

4. Extract the individual keywords from the PQconninfoOption structure and
   call PQconnectdbParams().

Is this in line with what you have in mind, or have you thought of some
other, simpler way of using the new APIs?
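
To make steps 1, 2 and 4 concrete with today's libpq, the flow would be
roughly as below (sketch only - I have not used the two proposed APIs here
because their exact signatures belong to Phil's patch; merging the
individual command-line options and the defaults, i.e. steps 2 and 3, is
exactly the part they would take over):

#include <stdio.h>
#include <stdlib.h>
#include "libpq-fe.h"

static PGconn *
connect_from_conninfo(const char *conninfo)
{
	char	   *errmsg = NULL;
	PQconninfoOption *opts = PQconninfoParse(conninfo, &errmsg);	/* step 1 */
	PQconninfoOption *opt;
	const char **keywords;
	const char **values;
	int			n = 0;
	int			i = 0;
	PGconn	   *conn;

	if (opts == NULL)
	{
		fprintf(stderr, "invalid connection string: %s\n",
				errmsg ? errmsg : "out of memory");
		if (errmsg)
			PQfreemem(errmsg);
		return NULL;
	}

	for (opt = opts; opt->keyword != NULL; opt++)
		n++;

	keywords = malloc((n + 1) * sizeof(*keywords));
	values = malloc((n + 1) * sizeof(*values));

	/*
	 * Step 2 (simplified): keep only the options that were actually set.
	 * Individual command-line options (and, in step 3, defaults) would be
	 * merged into these arrays before connecting.
	 */
	for (opt = opts; opt->keyword != NULL; opt++)
	{
		if (opt->val == NULL)
			continue;
		keywords[i] = opt->keyword;
		values[i] = opt->val;
		i++;
	}
	keywords[i] = NULL;
	values[i] = NULL;

	conn = PQconnectdbParams(keywords, values, 1);	/* step 4 */

	free(keywords);
	free(values);
	PQconninfoFree(opts);

	return conn;
}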

With Regards,
Amit Kapila.

