Re: [HACKERS] Triggers on foreign tables
On Wednesday, March 5, 2014, at 22:36:44, Noah Misch wrote:

Agreed. More specifically, I see only two scenarios for retrieving tuples from the tuplestore. Scenario one is that we need the next tuple (or pair of tuples, depending on the TriggerEvent). Scenario two is that we need the tuple(s) most recently retrieved. If that's correct, I'm inclined to rearrange afterTriggerInvokeEvents() and AfterTriggerExecute() to remember the tuple or pair of tuples most recently retrieved. They'll then never call tuplestore_advance() just to reposition. Do you see a problem with that?

I don't see any problem with that. I don't know how this would be implemented, but it would make sense to avoid those scans, as long as a fresh copy is passed to the trigger: modifications to a tuple performed in an after trigger should not be visible to the next one.

I was again somewhat tempted to remove ate_tupleindex, perhaps by defining the four flag bits this way:

#define AFTER_TRIGGER_DONE          0x1000
#define AFTER_TRIGGER_IN_PROGRESS   0x2000
/* two bits describing the size of and tuple sources for this event */
#define AFTER_TRIGGER_TUP_BITS      0xC000
#define AFTER_TRIGGER_FDW_REUSE     0x0000
#define AFTER_TRIGGER_FDW_FETCH     0x4000
#define AFTER_TRIGGER_1CTID         0x8000
#define AFTER_TRIGGER_2CTID         0xC000

AFTER_TRIGGER_FDW_FETCH and AFTER_TRIGGER_FDW_REUSE correspond to the aforementioned scenarios one and two, respectively. I think, though, I'll rate this as needless micro-optimization and not bother; opinions welcome. (The savings is four bytes per foreign table trigger event.)

I was already happy with having a lower footprint for foreign table trigger events than for regular trigger events, but if we remove the need for seeking in the tuplestore entirely, it would make sense to get rid of the index.

Thanks, nm

Thanks to you. -- Ronan Dunklau http://dalibo.com - http://dalibo.org
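The two-bit scheme sketched above packs four mutually exclusive tuple-source states under a single mask. A quick illustrative decode in Python (constant values taken from the quoted proposal; the helper function is a hypothetical illustration, not PostgreSQL code):

```python
# Constants from the proposal quoted above; tuple_source() is a
# hypothetical illustration of how the AFTER_TRIGGER_TUP_BITS mask
# selects one of four tuple-source states.
AFTER_TRIGGER_DONE        = 0x1000
AFTER_TRIGGER_IN_PROGRESS = 0x2000
AFTER_TRIGGER_TUP_BITS    = 0xC000  # two-bit mask covering the source states
AFTER_TRIGGER_FDW_REUSE   = 0x0000  # reuse the tuple(s) most recently retrieved
AFTER_TRIGGER_FDW_FETCH   = 0x4000  # fetch the next tuple(s) from the tuplestore
AFTER_TRIGGER_1CTID       = 0x8000  # event stores one heap ctid
AFTER_TRIGGER_2CTID       = 0xC000  # event stores two heap ctids

def tuple_source(flags):
    """Return which of the four source states an event's flags encode."""
    return flags & AFTER_TRIGGER_TUP_BITS
```

Because FDW_REUSE is the all-zero state, the space savings mentioned above would come from dropping the separate ate_tupleindex field, not from the bits themselves.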
Re: [HACKERS] Row-security on updatable s.b. views
On 06/03/14 02:56, Craig Ringer wrote: On 03/06/2014 04:56 AM, Yeb Havinga wrote: If you state it like that, it sounds like a POLA violation. But the complete story is: a user is allowed to UPDATE a set of rows from a table that is not a subset of the set of rows he can SELECT from the table; in other words, he can UPDATE rows he is not allowed to SELECT. Is there a compelling use case for this? Where it really makes sense to let users update/delete rows they cannot see via row security? We support it in the table based permissions model, but it's possible to do it with much saner semantics there. And with row security, it'll be possible with security definer functions. I intend to add a row security exempt flag for functions down the track, too. Use case: https://en.wikipedia.org/wiki/Bell-La_Padula_model - being able to write up and read down access levels. So for instance in healthcare, a data enterer may enter sensitive data from handwritten notes (such as the subject having AIDS) into the electronic health record, without having general read access at the sensitivity level of an AIDS diagnosis. I think what is important in use cases like this is that at data entry time the actual data is still at the desk, so having data returned for inserts in the running transaction might not be problematic. As most EHRs today are additive in nature, future additions about the AIDS condition would be inserts as well; no updates required. For updates, my best guess would be that allowing the command to run with rls permissions different from the select is not required. regards, Yeb -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
Thank you for checking that. Teodor's goal was that new-hstore be 100% backwards-compatible with old-hstore. If we're breaking APIs, then it

That's true. The binary format is fully compatible unless an old hstore value has more than 2^28 key-value pairs (256 million, which is far from reachable given memory requirements). The single issue is the GiST index; a GIN index should be recreated to utilize the new features.

doesn't really work to force users to upgrade the type, no? Teodor, are these output changes things that can be made consistent, or do we need separate hstore and hstore2 datatypes?

Introducing types in hstore causes this incompatibility - but I don't think that's a huge or even a big problem. In most cases the application does quoting (sets "1" instead of just 1) to prevent SQL injection and to protect hstore-forbidden characters in hstore. Keys are left untouched - a key can only be a string. It would be possible to introduce a GUC variable for the i/o functions to control the old bug-for-bug behavior. IMHO, this is a much better option than stopping hstore development or splitting hstore into two branches.

-- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
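The quoting point above can be illustrated with a toy serializer (a hedged Python sketch, not the actual hstore I/O code): quoting every value, e.g. "1" instead of 1, keeps hstore-reserved characters such as =>, commas, and quotes from breaking the literal.

```python
# Toy illustration (not PostgreSQL code) of why applications quote
# hstore values defensively: the key=>value syntax reserves =>, commas,
# spaces and quotes, so always emitting "1" instead of 1 is safe.
def hstore_literal(pairs):
    def quote(s):
        # escape backslashes and double quotes, then wrap in quotes
        return '"' + s.replace('\\', '\\\\').replace('"', '\\"') + '"'
    return ', '.join(quote(k) + '=>' + quote(v) for k, v in pairs)
```

An application that already quotes everything this way is unaffected by an i/o change that only alters how unquoted scalars are printed, which is the point being made above.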
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 12:23 AM, Teodor Sigaev teo...@sigaev.ru wrote: It would be possible to introduce a GUC variable for the i/o functions to control the old bug-for-bug behavior. IMHO, this is a much better option than stopping hstore development or splitting hstore into two branches. A GUC that controls i/o functions is generally considered to be an unacceptable hack. In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. Having looked at the issue today, I think that the amount of redundant code between a hstore2 in core as jsonb and hstore1 will be acceptable. The advantages of making a clean break from having to support the legacy hstore disk format strengthen the case for doing so too. -- Peter Geoghegan
Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)
On Tue, Mar 4, 2014 at 3:07 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

4.
+ cchunk = ccache_vacuum_tuple(ccache, ccache->root_chunk, ctid);
+ if (pchunk != NULL && pchunk != cchunk)
+     ccache_merge_chunk(ccache, pchunk);
+ pchunk = cchunk;

The merge_chunk is called only when the heap tuples are spread across two cache chunks. Actually one cache chunk can accommodate one or more heap pages, so it needs some other way of handling. I adjusted the logic to merge the chunks as follows: Once a tuple is vacuumed from a chunk, it also checks whether it can be merged with its child leafs. A chunk has up to two child leafs; the left one has a smaller ctid than the parent, and the right one has a greater ctid. It means a chunk without a right child in the left sub-tree, or a chunk without a left child in the right sub-tree, is a neighbor of the chunk being vacuumed. In addition, if the vacuumed chunk does not have either (or both) of its children, it can be merged with the parent node. I modified ccache_vacuum_tuple() to merge chunks during the t-tree walk-down, if the vacuumed chunk has enough free space. Patch looks good. Regarding merging of the nodes, instead of checking whether a merge is possible or not for every tuple which is vacuumed, can we put in some kind of threshold: whenever the node is 50% free, try to merge it from the leaf nodes until it is 90% full? The remaining 10% will be left for the next inserts on the node. What do you say? I will update you later regarding the performance test results. Regards, Hari Babu Fujitsu Australia
Re: [HACKERS] jsonb and nested hstore
In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. It's true for Perl. The syntax of hstore is close to its hash/array syntax and it's easy to serialize/deserialize hstore to/from Perl. The syntax of hstore was inspired by Perl. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 5 March 2014 09:28, Simon Riggs si...@2ndquadrant.com wrote: So that returns us to solving the catalog consistency problem in pg_dump and similar applications. No answer, guys, and time is ticking away here. I'd like to get a communal solution to this rather than just punting the whole patch. If we have to strip it down to the bare essentials, so be it. For me, the biggest need here is to make VALIDATE CONSTRAINT take only a ShareUpdateExclusiveLock while it runs. Almost everything else needs a full AccessExclusiveLock anyway, or doesn't run for very long so isn't a critical problem. (Perhaps we can then wrap ADD CONSTRAINT ... NOT VALID and VALIDATE into a single command using the CONCURRENTLY keyword so it runs two transactions to complete the task.) Validating FKs on big tables can take hours and it really isn't acceptable for us to lock out access while we do that. FKs are *supposed* to be a major reason people use an RDBMS, so keeping them in a state where they are effectively unusable is a major debilitating point against adoption of PostgreSQL. If there are issues with pg_dump we can just document them. Guide me with your thoughts. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 1:32 AM, Teodor Sigaev teo...@sigaev.ru wrote: It's true for Perl. The syntax of hstore is close to its hash/array syntax and it's easy to serialize/deserialize hstore to/from Perl. The syntax of hstore was inspired by Perl. I understand that. There is a module on CPAN called Pg::hstore that will do this; it appears to have been around since 2011. I don't use Perl, so I don't know a lot about it. Perhaps David Wheeler has an opinion on the value of Perl-like syntax, as a long-time Perl enthusiast? In any case, Perl has excellent support for JSON, just like every other language - you are at no particular advantage in Perl by having a format that happens to more closely resemble the format of Perl hashes and arrays. I really feel that we should concentrate our efforts on one standardized format here. It makes the effort of integrating your good work in a way that makes it available to everyone so much easier. -- Peter Geoghegan
Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)
2014-03-06 18:17 GMT+09:00 Haribabu Kommi kommi.harib...@gmail.com: On Tue, Mar 4, 2014 at 3:07 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

4.
+ cchunk = ccache_vacuum_tuple(ccache, ccache->root_chunk, ctid);
+ if (pchunk != NULL && pchunk != cchunk)
+     ccache_merge_chunk(ccache, pchunk);
+ pchunk = cchunk;

The merge_chunk is called only when the heap tuples are spread across two cache chunks. Actually one cache chunk can accommodate one or more heap pages, so it needs some other way of handling. I adjusted the logic to merge the chunks as follows: Once a tuple is vacuumed from a chunk, it also checks whether it can be merged with its child leafs. A chunk has up to two child leafs; the left one has a smaller ctid than the parent, and the right one has a greater ctid. It means a chunk without a right child in the left sub-tree, or a chunk without a left child in the right sub-tree, is a neighbor of the chunk being vacuumed. In addition, if the vacuumed chunk does not have either (or both) of its children, it can be merged with the parent node. I modified ccache_vacuum_tuple() to merge chunks during the t-tree walk-down, if the vacuumed chunk has enough free space. Patch looks good. Thanks for volunteering. Regarding merging of the nodes, instead of checking whether a merge is possible or not for every tuple which is vacuumed, can we put in some kind of threshold: whenever the node is 50% free, try to merge it from the leaf nodes until it is 90% full? The remaining 10% will be left for the next inserts on the node. What do you say? Hmm. Indeed, it makes sense. How about an idea that kicks off chunk merging if the expected free space of the merged chunk is less than 50%? If the threshold depends on the (expected) usage of the merged chunk, it can avoid over-merging. I will update you later regarding the performance test results. Thanks. Also, I'll rebase the patch on top of the new custom-scan interfaces according to Tom's suggestion, even though the main logic of cache_scan is not changed.
Best regards, -- KaiGai Kohei kai...@kaigai.gr.jp
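The threshold under discussion (kick off a merge only when the merged chunk's expected free space is under 50%) can be sketched as a simple predicate. Everything here - the capacity argument, the helper name - is an illustrative assumption, not the cache_scan patch itself:

```python
# Illustrative sketch of the merge heuristic from the thread: merge two
# T-tree chunks only when the merged chunk would be at least half full,
# i.e. its expected free space is less than 50%. Names are assumptions.
MERGE_FILL_THRESHOLD = 0.50

def should_merge(used_a, used_b, capacity):
    """Return True if merging two chunks (with used_a and used_b tuples)
    into one chunk of the given capacity is worthwhile."""
    total = used_a + used_b
    if total > capacity:
        return False  # the combined tuples would not fit in one chunk
    return total / capacity >= MERGE_FILL_THRESHOLD
```

Evaluating a predicate like this once per vacuumed chunk, rather than attempting a merge for every vacuumed tuple, is exactly the over-merging concern raised above.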
[HACKERS] CREATE TYPE similar CHAR type
I want to use CREATE TYPE to create a type similar to char. When I create the type, I want it to behave like char: CREATE TABLE test (oneChar char); When I insert a value with length > 1 into that column, I get this error: ERROR: value too long for type character(1) I want my type to behave like this, but it behaves like the varchar type instead.
Re: [HACKERS] CREATE TYPE similar CHAR type
On 6 March 2014 11:24, Mohsen SM mohsensoodk...@gmail.com wrote: I want to use CREATE TYPE to create a type similar to char. When I create the type, I want it to behave like char: CREATE TABLE test (oneChar char); When I insert a value with length > 1 into that column, I get this error: ERROR: value too long for type character(1) I want my type to behave like this, but it behaves like the varchar type instead. If you did that, you'd have a char field padded out to fill up 10 megabytes-worth of characters. I doubt that's what you want. It's not clear what you want this for, though. If you want an arbitrary number of characters, just use the text data type. -- Thom
[HACKERS] Next CommitFest Deadlines
Hi all, Is there some place with the next commitfest deadlines (2014/06 and 2014/09)? Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] GSoC on WAL-logging hash indexes
(de-CC'ing pgsql-advocacy) On 03/06/2014 04:03 AM, Greg Stark wrote: On Mon, Mar 3, 2014 at 4:12 PM, Robert Haas robertmh...@gmail.com wrote: Unfortunately, I don't believe that it's possible to do this easily today because of the way bucket splits are handled. I wrote about this previously here, with an idea for solving the problem: We could just tackle this in the same incomplete, buggy way that btrees tackled it for years until Heikki fixed them, and the way gin and gist still do, I believe. Namely, just emit xlog records for each page individually and during replay remember when you have an incomplete split and complain if recovery ends with any still incomplete. It would be unfortunate to be adding new cases of this just as Heikki and company are making progress eliminating the ones we already had, but that's surely better than having no recovery at all. Grmph. Indeed. Looking at Robert's idea from November: I have thought about doing some work on this, although I don't know when I'll ever have the time. As far as I can see, the basic problem is that we need a better way of splitting buckets. Right now, splitting a bucket means locking it, scanning the entire bucket chain and moving ~1/2 the tuples to the new bucket, and then unlocking it. Since this is a long operation, we have to use heavyweight locks to protect the buckets, which is bad for concurrency. Since it involves a modification to an arbitrarily large number of pages that has to be atomic, it's not possible to WAL-log it sensibly -- and in fact, I think this is really the major obstacle to implementing WAL-logging for hash indexes. I don't think it's necessary to improve concurrency just to get WAL-logging. Better concurrency is a worthy goal of its own, of course, but it's a separate concern.
For crash safety, the key is to make sure you perform the whole operation in small atomic steps, and you have a way of knowing where to continue after crash (this is the same whether you do the cleanup immediately at the end of recovery, which I want to avoid, or lazily afterwards). But you can hold locks across those small atomic steps, to ensure concurrency-safety. I think we could work around this problem as follows. Suppose we have buckets 1..N and the split will create bucket N+1. We need a flag that can be set and cleared on the metapage, call it split-in-progress, and a flag that can optionally be set on particular index tuples, call it moved-by-split. We will allow scans of all buckets and insertions into all buckets while the split is in progress, but (as now) we will not allow more than one split to be in progress at the same time. Hmm, unless I'm reading the code wrong, it *does* allow more than one split to be in progress at the same time today. [description of how to use the moved-by-split to allow scans to run concurrently with the split] I guess that would work, although you didn't actually describe how to continue a split after a crash. But it's a lot simpler if you don't also try to make it more concurrent: --- When you start splitting a bucket, first acquire a heavy-weight lock on the old and new buckets. Allocate the required number of pages, before changing anything on-disk, so that you can easily back out if you run out of disk space. So far, this is how splitting works today. Then you update the metapage to show that the bucket has been split and initialize the new bucket's primary page (as one atomic WAL-logged operation). Also mark the new bucket's primary page with a split-in-progress flag. That's used later by scans to detect if the split was interrupted. Now you scan the old bucket, and move all the tuples belonging to the new bucket.
That needs to be done as a series of small atomic WAL-logged operations, each involving a small number of old and new pages (one WAL record for each moved tuple is the simplest, but you can do some batching for efficiency). After you're done, clear the split-in-progress flag in the new bucket's primary page (WAL-log that), and release the locks. In a scan, if the bucket you're about to scan has the split-in-progress flag set, that indicates that the split was interrupted by a crash. You won't see the flag as set if a concurrent split is in progress, because you will block on the lock it's holding on the bucket. If you see the flag as set, you share-lock and scan both buckets, the old and the new. If you see the split-in-progress flag in the bucket you're about to insert to, you don't need to do anything special. Just insert the tuple to the new bucket as normal. Before splitting the new bucket again, however, the previous split needs to be finished, or things will get complicated. To do that, acquire the locks on the old and the new bucket, and scan the old bucket for any remaining tuples that belong to the new bucket and move them, and finally clear the split-in-progress flag. --- This is similar to
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 12:43 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Mar 6, 2014 at 12:23 AM, Teodor Sigaev teo...@sigaev.ru wrote: It would be possible to introduce a GUC variable for the i/o functions to control the old bug-for-bug behavior. IMHO, this is a much better option than stopping hstore development or splitting hstore into two branches. A GUC that controls i/o functions is generally considered to be an unacceptable hack. In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. Having looked at the issue today, I think that the amount of redundant code between a hstore2 in core as jsonb and hstore1 will be acceptable. The advantages of making a clean break from having to support the legacy hstore disk format strengthen the case for doing so too. Heh, let's not make an impulsive decision about hstore2. I agree that jsonb has a lot of facilities, but don't forget that json(b) has to follow the standard, and in that sense it's more constrained than hstore, which we could further develop to support some interesting features which will never be implemented in json(b). Also, it'd be a bit awkward, after working on nested hstore and presenting it at several conferences (Engine Yard has sponsored part of our hstore work), to suddenly break people's expectations and say that our work has moved to core to provide json some very cool features - good bye, hstore users :( I'm afraid people will not understand us.
Re: [HACKERS] pg_ctl status with nonexistent data directory
On Thu, Mar 6, 2014 at 09:54:57AM +0530, Amit Kapila wrote: If they haven't passed us a data directory, we don't really know if the server is running or not, so the patch just returns '1'. But for such cases, isn't the status 4 more appropriate? As per the above link: 4 - program or service status is unknown; 1 - program is dead and /var/run pid file exists. Going by this definition, it seems status 1 means someone has forcefully killed the server and the pid file still remains. Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that. I don't see how specifying a non-existent or non-cluster directory would somehow be a case that would be special. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 08:16 AM, Oleg Bartunov wrote: On Thu, Mar 6, 2014 at 12:43 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Mar 6, 2014 at 12:23 AM, Teodor Sigaev teo...@sigaev.ru wrote: It would be possible to introduce a GUC variable for the i/o functions to control the old bug-for-bug behavior. IMHO, this is a much better option than stopping hstore development or splitting hstore into two branches. A GUC that controls i/o functions is generally considered to be an unacceptable hack. In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. Having looked at the issue today, I think that the amount of redundant code between a hstore2 in core as jsonb and hstore1 will be acceptable. The advantages of making a clean break from having to support the legacy hstore disk format strengthen the case for doing so too. Heh, let's not make an impulsive decision about hstore2. I agree that jsonb has a lot of facilities, but don't forget that json(b) has to follow the standard, and in that sense it's more constrained than hstore, which we could further develop to support some interesting features which will never be implemented in json(b). Also, it'd be a bit awkward, after working on nested hstore and presenting it at several conferences (Engine Yard has sponsored part of our hstore work), to suddenly break people's expectations and say that our work has moved to core to provide json some very cool features - good bye, hstore users :( I'm afraid people will not understand us. Oleg, I hear you, and largely agree, as long as the compatibility issue is solved. If it's not, I think inventing a new hstore2 type is probably a lousy way to go.
For good or ill, the world has pretty much settled on wanting to use json for lightweight treeish data. That's where we'll get the most impact. Virtually every programming language (including Perl) has good support for json. I'm not sure what the constraints of json that you might want to break are. Perhaps you'd like to specify. Whatever we do, rest assured your work won't go to waste. cheers andrew
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 09:33:18AM -0500, Andrew Dunstan wrote: I hear you, and largely agree, as long as the compatibility issue is solved. If it's not, I think inventing a new hstore2 type is probably a lousy way to go. For good or ill, the world has pretty much settled on wanting to use json for lightweight treeish data. That's where we'll get the most impact. Virtually every programming language (including Perl) has good support for json. I'm not sure what the constraints of json that you might want to break are. Perhaps you'd like to specify. Whatever we do, rest assured your work won't go to waste. OK, just to summarize:

- JSONB and everything it shares with hstore will be in core
- hstore-specific code stays in contrib
- the hstore contrib will create an hstore type to call contrib and core code
- 9.4 hstore has some differences from pre-9.4

The question is whether we change/improve hstore in 9.4, or create an hstore2 that is the improved hstore for 9.4 and keep hstore identical to pre-9.4. That last option looks an awful lot like the dreaded VARCHAR2. What can we do to help people migrate to an hstore type that supports data types? Is there a function we can give them to flag possible problem data, or give them some function to format things the old way for migrations, etc.? If they are going to have to rewrite all their old data, why bother with a backward-compatible binary format? Is it only the client applications that will need to be changed? How would we instruct users on the necessary changes? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] pg_ctl status with nonexistent data directory
On Mar 6, 2014, at 00:08, Bruce Momjian br...@momjian.us wrote: I have addressed this issue with the attached patch:

$ pg_ctl -D /lkjasdf status
pg_ctl: directory /lkjasdf does not exist
$ pg_ctl -D / status
pg_ctl: directory / is not a database cluster directory

One odd question is that pg_ctl status has this comment for reporting the exit code for non-running servers:

printf(_("%s: no server running\n"), progname);
/*
 * The Linux Standard Base Core Specification 3.1 says this should return '3'
 * https://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
 */
exit(3);

If they haven't passed us a data directory, we don't really know if the server is running or not, so the patch just returns '1'. Why change the exit code at all in the ENOENT case? If the directory does not exist, it's fairly certain that the server is not running, so 3 seems fine. Technically, changing the return value is an API change and might break things, so why do it if there's no clear benefit? In the EPERM case (or, rather, the non-ENOENT case), I agree with Amit that 4 (meaning "program or service status is unknown") fits much better than 1 (meaning "program is dead and /var/run pid file exists"). So *if* we change it at all, we should change it to 4, not to some other, equally arbitrary value. best regards, Florian Pflug
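For reference, the LSB status exit codes under discussion, plus a sketch of the mapping argued for above. The helper function is illustrative of that argument, not pg_ctl's actual behavior:

```python
# LSB 3.1 init-script "status" exit codes referenced in this thread.
LSB_RUNNING = 0        # program is running
LSB_DEAD_PIDFILE = 1   # program is dead and /var/run pid file exists
LSB_NOT_RUNNING = 3    # program is not running
LSB_UNKNOWN = 4        # program or service status is unknown

def pg_ctl_status_code(dir_exists, is_cluster, running):
    """Illustrative mapping following the argument above: a missing
    directory means the server certainly isn't running (3), while an
    unreadable or non-cluster directory leaves the status unknown (4)."""
    if not dir_exists:
        return LSB_NOT_RUNNING
    if not is_cluster:
        return LSB_UNKNOWN
    return LSB_RUNNING if running else LSB_NOT_RUNNING
```

Under this mapping, code 1 is never returned for a bad directory argument, since nothing is known about any pid file in that case.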
Re: [HACKERS] pg_ctl status with nonexistent data directory
Bruce Momjian wrote: Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that. That sounds odd. Do you have a link? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] jsonb and nested hstore
Bruce Momjian br...@momjian.us writes: OK, just to summarize:

- JSONB and everything it shares with hstore will be in core
- hstore-specific code stays in contrib
- the hstore contrib will create an hstore type to call contrib and core code
- 9.4 hstore has some differences from pre-9.4

I've got a problem with the last part of that. AFAICS, the value proposition for hstore2 largely fails if it's not 100% upward compatible with existing hstore, both as to on-disk storage and as to application-visible behavior. If you've got to adapt your application anyway, why not switch to JSONB, which is going to offer a lot of benefits in terms of available code you can work with? Although I've not looked at the patch, it was claimed upthread that there were changes in the I/O format for existing test cases, for example. IMO, that's an absolute dead no-go. The question is whether we change/improve hstore in 9.4, or create an hstore2 that is the improved hstore for 9.4 and keep hstore identical to pre-9.4. That last option looks an awful lot like the dreaded VARCHAR2. I think hstore2 as a separate type isn't likely to be a win either. The bottom line here is that hstore2 is more or less what we'd agreed to doing back at the last PGCon, but that decision has now been obsoleted by events in the JSON area. If jsonb gets in, I think we probably end up rejecting hstore2 as such. Or at least, that's what we should do IMO. contrib/hstore is now a legacy type and we shouldn't be putting additional work into it, especially not work that breaks backwards compatibility. regards, tom lane
Re: [HACKERS] pg_ctl status with nonexistent data directory
Alvaro Herrera alvhe...@2ndquadrant.com writes: Bruce Momjian wrote: Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that. That sounds odd. Do you have a link? FWIW, I recall that in my former life at Red Hat, I had to deal several times with patching pg_ctl to make its exit codes more LSB-compliant. And I think Debian does the same. So Linux packagers, at least, would like to see us pay more attention to that standard. On the other hand, there is no market for changes that make the exit codes more meaningful unless the changes can be justified by chapter and verse in LSB. regards, tom lane
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Wed, Mar 5, 2014 at 10:49 PM, Tom Lane t...@sss.pgh.pa.us wrote: [ shrug... ] They can see whether the Window plan node is where the time is going. It's not apparent to me that the extra numbers you propose to report will edify anybody. Perhaps just saying Incremental Window Function versus Iterated Window Function or something like that would be sufficient? At least that way query tuning guidelines have a keyword they can say to watch out for. And someone trying to figure out *why* the time is being spent in this node has something they might notice a correlation with. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thursday, March 6, 2014 09:33:18, Andrew Dunstan wrote: On 03/06/2014 08:16 AM, Oleg Bartunov wrote: On Thu, Mar 6, 2014 at 12:43 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Mar 6, 2014 at 12:23 AM, Teodor Sigaev teo...@sigaev.ru wrote: It's possible to introduce a GUC variable for the i/o functions which would control the old bug-for-bug behavior. IMHO, this is a much better option than stopping hstore development or splitting hstore into two branches. A GUC that controls i/o functions is generally considered to be an unacceptable hack. In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. Having looked at the issue today, I think that the amount of redundant code between a hstore2 in core as jsonb and hstore1 will be acceptable. The advantages of making a clean break from having to support the legacy hstore disk format strengthen the case for doing so too. Heh, let's not make an impulsive decision about hstore2. I agree that jsonb has a lot of facilities, but don't forget that json(b) has to follow the standard, and in that sense it's more constrained than hstore, which we could further develop to support some interesting features which will never be implemented in json(b). Also, it'd be a bit awkward, after working on nested hstore and announcing it at several conferences (Engine Yard has sponsored part of our hstore work), to suddenly break people's expectations and say that our work has moved to core to provide json some very cool features: good bye, hstore users :( I'm afraid people will not understand us. Oleg, I hear you, and largely agree, as long as the compatibility issue is solved. 
If it's not, I think inventing a new hstore2 type is probably a lousy way to go. For good or ill, the world has pretty much settled on wanting to use json for lightweight treeish data. That's where we'll get the most impact. Virtually every programming language (including Perl) has good support for json. I'm not sure what the constraints of json that you might want to break are. Perhaps you'd like to specify. I haven't followed the whole thread, but json is really restrictive on the supported types: a hierarchical hstore could maybe support more types (timestamp comes to mind) as its values, which is not a valid data type in the json spec. Whatever we do, rest assured your work won't go to waste. cheers andrew -- Ronan Dunklau http://dalibo.com - http://dalibo.org
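Ronan's point about the json spec's type restrictions is easy to demonstrate. A minimal sketch using Python's standard json module (datetime is just one example of a type json cannot represent natively):

```python
import json
from datetime import datetime

# The JSON spec only defines strings, numbers, booleans, null, arrays
# and objects, so a timestamp has no native representation:
try:
    json.dumps({"seen": datetime(2014, 3, 6)})
except TypeError:
    print("datetime is not JSON-serializable")

# Applications typically fall back to strings, losing the type:
print(json.dumps({"seen": datetime(2014, 3, 6).isoformat()}))
```

A hierarchical hstore, not being bound by the json spec, could in principle carry such values with their types intact.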
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 4:25 PM, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: OK, just to summarize: JSONB and everything it shares with hstore will be in core hstore-specific code stays in contrib hstore contrib will create an hstore type to call contrib and core code 9.4 hstore has some differences from pre-9.4 I've got a problem with the last part of that. AFAICS, the value proposition for hstore2 largely fails if it's not 100% upward compatible with existing hstore, both as to on-disk storage and as to application- visible behavior. If you've got to adapt your application anyway, why not switch to JSONB which is going to offer a lot of benefits in terms of available code you can work with? Although I've not looked at the patch, it was claimed upthread that there were changes in the I/O format for existing test cases, for example. IMO, that's an absolute dead no-go. The question is whether we change/improve hstore in 9.4, or create an hstore2 that is the improved hstore for 9.4 and keep hstore identical to pre-9.4. That last option looks an awful like the dreaded VARCHAR2. I think hstore2 as a separate type isn't likely to be a win either. The bottom line here is that hstore2 is more or less what we'd agreed to doing back at the last PGCon, but that decision has now been obsoleted by events in the JSON area. If jsonb gets in, I think we probably end up rejecting hstore2 as such. Or at least, that's what we should do IMO. contrib/hstore is now a legacy type and we shouldn't be putting additional work into it, especially not work that breaks backwards compatibility. (not read up on the full details of the thread, sorry if I'm re-iterating something) I think we definitely want/need to maintain hstore compatibility. A completely separate hstore2 type that's not backwards compatible makes very little sense. 
However, if the new hstore type (compatible with the old one) is the wrapper around jsonb, rather than the other way around, I don't see any problem with it at all. Most future users are almost certainly going to use the json interfaces, but we don't want to leave upgraded users behind. (But of course it has to actually maintain backwards compatibility for that argument to hold) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] jsonb and nested hstore
Magnus Hagander mag...@hagander.net writes: However, if the new hstore type (compatible with the old one) is the wrapper around jsonb, rather than the other way around, I don't see any problem with it at all. Most future users are almost certainly going to use the json interfaces, but we don't want to leave upgraded users behind. (But of course it has to actually maintain backwards compatibility for that argument to hold) Yeah --- all of this turns on whether hstore improvements can be 100% upwards compatible or not. If they are, I don't object to including them; I'd have said it was wasted effort, but if the work is already done then that's moot. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 10:46 AM, Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: However, if the new hstore type (compatible with the old one) is the wrapper around jsonb, rather than the other way around, I don't see any problem with it at all. Most future users are almost certainly going to use the json interfaces, but we don't want to leave upgraded users behind. (But of course it has to actually maintain backwards compatibility for that argument to hold) Yeah --- all of this turns on whether hstore improvements can be 100% upwards compatible or not. If they are, I don't object to including them; I'd have said it was wasted effort, but if the work is already done then that's moot. Clearly there are people who want it, or else they would not have sponsored the work. We seem to have an emerging consensus on the compatibility issue. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE TYPE similar CHAR type
mohsencs wrote: I want to use CREATE TYPE to create a type similar to char, so that when I create the type it behaves like char: CREATE TABLE test (oneChar char); when I insert a value with length > 1 into it, it gives this error: ERROR: value too long for type character(1). I want my type to behave like this, but it behaves like the varchar type. If you can get over the need for using CREATE TYPE, you'll find that using CREATE DOMAIN with a check constraint will probably meet your needs perfectly. David J. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
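A sketch of the CREATE DOMAIN approach David suggests; the domain and table names are illustrative, not taken from the original message:

```sql
-- A domain over text that rejects values longer than one character,
-- giving behavior close to char(1)'s "value too long" error:
CREATE DOMAIN onechar AS text
    CHECK (char_length(VALUE) <= 1);

CREATE TABLE test (onecol onechar);
INSERT INTO test VALUES ('a');   -- ok
INSERT INTO test VALUES ('ab');  -- ERROR: violates check constraint
```

The error message differs from char(1)'s, but the rejection behavior is what the original poster asked for.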
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 05:46 PM, Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: However, if the new hstore type (compatible with the old one) is the wrapper around jsonb, rather than the other way around, I don't see any problem with it at all. Most future users are almost certainly going to use the json interfaces, but we don't want to leave upgraded users behind. (But of course it has to actually maintain backwards compatibility for that argument to hold) Yeah --- all of this turns on whether hstore improvements can be 100% upwards compatible or not. If they are, I don't object to including them; There are reasons for *not* wanting the new hstore2 functionality. If you don't want nesting, for example, with the new type you're going to need to add a constraint to forbid that. Ugh. Many applications are happy with the current functionality, a simple string key/value dictionary, and for them the new features are not an improvement. As an analogy, adding significant new functionality like nesting to the existing hstore type is like suddenly adding the time of day to the date datatype. It might be useful in many cases. And an existing application can leave the hour and minute fields zero, so it's backwards-compatible. But as soon as someone inserts a datum that uses the hour and minute fields, it will confuse the application that doesn't know about that. I haven't been following these discussions closely, but for those reasons, I thought hstore2 was going to be a separate type. I don't think there are very many applications that would be interested in upgrading from the current hstore to the new hstore2 type. More likely, the new data type is useful for many applications that couldn't have used hstore before because it didn't support nesting or was too loosely typed. And old applications that are already using hstore are perfectly happy with the status quo. Let's not mess with the existing hstore datatype. For what it does, it works great. 
Likewise, jsonb is significantly different from hstore2, so it should be a separate data type. Frankly, I don't understand what the problem is with doing that. I don't have a problem with copy-pasting the common parts. BTW, now that I look at the nested hstore patch, I'm disappointed to see that it only supports a few hardcoded datatypes. Call me naive, but somehow I thought it would support *all* PostgreSQL datatypes, built-in or user-defined. I realize that's a tall order, but that's what I thought it did. Since it doesn't, color me unimpressed. It's really not any better than json; I don't see why anyone would prefer it over json. Not that I particularly like json, but it's a format a lot of people are familiar with. So here's my opinion on what we should do: 1. Forget about hstore2. 2. Add GIN and GiST operator classes to jsonb, if they're ready for commit pretty darn soon. If not, punt them to the next release. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Securing make check (CVE-2014-0067)
Noah Misch n...@leadboat.com writes: Thanks. To avoid socket path length limitations, I lean toward placing the socket temporary directory under /tmp rather than placing under the CWD: I'm not thrilled with that; it's totally insecure on platforms where /tmp isn't sticky, so it doesn't seem like an appropriate solution given that this discussion is now being driven by security concerns. http://www.postgresql.org/message-id/flat/20121129223632.ga15...@tornado.leadboat.com I re-read that thread. While we did fix the reporting end of it, ie the postmaster will now give you a clear failure message if your socket path is too long, that's going to be cold comfort to anyone who has to build in an environment they don't have much control over (such as my still-hypothetical-I-hope scenario about Red Hat package updates). I'm inclined to suggest that we should put the socket under $CWD by default, but provide some way for the user to override that choice. If they want to put it in /tmp, it's on their head as to how secure that is. On most modern platforms it'd be fine. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 11:28 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: So here my opinion on what we should do: 1. Forget about hstore2 2. Add GIN and GIST operator classes to jsonb, if they're ready for commit pretty darn soon. If not, punt them to next release. For #2, would we maintain the hstore syntax for the searching operators? For example, SELECT count(*) FROM jsonb_schema WHERE tabledata @> 'columns => {{column_name=>total_time}}'; Note the hstore-ish => in the searching operator. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
Hi there, Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. As for the old hstore I think it'd be nice to add gin_hstore_hash_ops, so hstore users will benefit from the 9.4 release. There is no compatibility issue, so I think this could be harmless. Oleg On Thu, Mar 6, 2014 at 7:25 PM, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: OK, just to summarize: JSONB and everything it shares with hstore will be in core hstore-specific code stays in contrib hstore contrib will create an hstore type to call contrib and core code 9.4 hstore has some differences from pre-9.4 I've got a problem with the last part of that. AFAICS, the value proposition for hstore2 largely fails if it's not 100% upward compatible with existing hstore, both as to on-disk storage and as to application-visible behavior. If you've got to adapt your application anyway, why not switch to JSONB which is going to offer a lot of benefits in terms of available code you can work with? Although I've not looked at the patch, it was claimed upthread that there were changes in the I/O format for existing test cases, for example. IMO, that's an absolute dead no-go. The question is whether we change/improve hstore in 9.4, or create an hstore2 that is the improved hstore for 9.4 and keep hstore identical to pre-9.4. That last option looks an awful lot like the dreaded VARCHAR2. I think hstore2 as a separate type isn't likely to be a win either. The bottom line here is that hstore2 is more or less what we'd agreed to doing back at the last PGCon, but that decision has now been obsoleted by events in the JSON area. If jsonb gets in, I think we probably end up rejecting hstore2 as such. Or at least, that's what we should do IMO. 
contrib/hstore is now a legacy type and we shouldn't be putting additional work into it, especially not work that breaks backwards compatibility. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
Kouhei Kaigai kai...@ak.jp.nec.com writes: I expected to include simple function pointers for copying and text-output as follows: typedef struct { Plan plan; : NodeCopy_function node_copy; NodeTextOut_function node_textout; } Custom; I was thinking more like typedef struct CustomPathFuncs { const char *name; /* used for debugging purposes only */ NodeCopy_function node_copy; NodeTextOut_function node_textout; ... etc etc etc ... } CustomPathFuncs; typedef struct CustomPath { Path path; const CustomPathFuncs *funcs; ... maybe a few more fields here, but not too darn many ... } CustomPath; and similarly for CustomPlan. The advantage of this way is it's very cheap for (what I expect will be) the common case where an extension has a fixed set of support functions for its custom paths and plans. It just declares a static constant CustomPathFuncs struct, and puts a pointer to that into its paths. If an extension really needs to set the support functions on a per-object basis, it can do this: typedef struct MyCustomPath { CustomPath cpath; CustomPathFuncs funcs; ... more fields ... } MyCustomPath; and then initialization of a MyCustomPath would include mypath->cpath.funcs = &mypath->funcs; mypath->funcs.node_copy = MyCustomPathCopy; ... etc etc ... In this case we're arguably wasting one pointer worth of space in the path, but considering the number of function pointers such a path will be carrying, I don't think that's much of an objection. So? If you did that, then you wouldn't have renumbered the Vars as INNER/OUTER. I don't believe that CUSTOM_VAR is necessary at all; if it is needed, then there would also be a need for an additional tuple slot in executor contexts, which you haven't provided. For example, the enhanced postgres_fdw fetches the result set of a remote join query, thus a tuple contains fields that come from both sides. In this case, what varno would be suitable? Not sure what we'd do for the general case, but CUSTOM_VAR isn't the solution. 
Consider for example a join where both tables supply columns named id --- if you put them both in one tupledesc then there's no non-kluge way to identify them. Possibly the route to a solution involves adding another plan-node callback function that ruleutils.c would use for printing Vars in custom join nodes. Or maybe we could let the Vars keep their original RTE numbers, though that would complicate life at execution time. Anyway, if we're going to punt on add_join_path_hook for the time being, this problem can probably be left to solve later. It won't arise for simple table-scan cases, nor for single-input plan nodes such as sorts. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 12:50 PM, Oleg Bartunov wrote: Hi there, Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. We don't normally credit sponsors in commits, but if I'm doing the commit I promise you guys would certainly get major credit as authors. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
I meant in Release Notes for 9.4 On Thu, Mar 6, 2014 at 10:26 PM, Andrew Dunstan and...@dunslane.net wrote: On 03/06/2014 12:50 PM, Oleg Bartunov wrote: Hi there, Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. We don't normally credit sponsors in commits, but if I'm doing the commit I promise you guys would certainly get major credit as authors. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 07:00 AM, Bruce Momjian wrote: What can we do to help people migrate to an hstore type that supports data types? Is there a function we can give them to flag possible problem data, or give them some function to format things the old way for migrations, etc. If they are going to have to rewrite all their old data, why bother with a backward-compatible binary format? Is it only the client applications that will need to be changed? How would we instruct users on the necessary changes? So, from what I've been able to check: The actual storage upgrade of hstore--hstore2 is fairly painless from the user perspective; they don't have to do anything. The problem is that the input/output strings are different, something which I didn't think to check for (and Peter did), and which will break applications relying on Hstore, since the drivers which support Hstore (like psycopg2) rely on string-parsing to convert it. I haven't regression-tested hstore2 against psycopg2 since I don't have a good test, but that would be a useful thing to do. Hstore2 supports the same limited data types as JSON does, and not any additional ones. This makes an hstore2 of dubious value unless the compatibility issues can be solved conclusively. Is that all correct? Have I missed something? On 03/06/2014 09:50 AM, Oleg Bartunov wrote: Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. We don't generally credit companies in the release notes, since if we started, where would we stop? However, we *do* credit them in the press release, and I'll make a note of the EY sponsorship, especially since it's also good PR. -- Josh Berkus PostgreSQL Experts Inc. 
http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
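The driver-breakage risk Josh describes comes from exactly this kind of code. Below is a deliberately minimal sketch of parsing hstore's classic text format, the way a driver such as psycopg2 must; the real parser is more careful, but the point is that any change to the output format silently breaks this sort of string matching:

```python
import re

# One "key"=>"value" (or "key"=>NULL) pair in hstore's classic text output.
PAIR_RE = re.compile(r'"((?:[^"\\]|\\.)*)"\s*=>\s*(?:"((?:[^"\\]|\\.)*)"|NULL)')

def unescape(s):
    # Undo the backslash escaping hstore applies to quotes and backslashes.
    return s.replace('\\"', '"').replace('\\\\', '\\')

def parse_hstore(text):
    """Parse hstore text output into a dict (minimal sketch, not a full parser)."""
    result = {}
    for m in PAIR_RE.finditer(text):
        value = m.group(2)
        result[unescape(m.group(1))] = None if value is None else unescape(value)
    return result

print(parse_hstore('"a"=>"1", "b"=>NULL'))  # {'a': '1', 'b': None}
```

A regression test running a real driver's parser against the new server output would directly answer the question Josh raises.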
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 10:54 PM, Josh Berkus j...@agliodbs.com wrote: On 03/06/2014 09:50 AM, Oleg Bartunov wrote: Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. We don't generally credit companies in the release notes, since if we started, where would we stop? However, we *do* credit them in the press release, and I'll make a note of the EY sponsorship, especially since it's also good PR. I think the press release is fine. We waited a long time for sponsorship of our work, and EY's help was crucial. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC on WAL-logging hash indexes
On Thu, Mar 6, 2014 at 8:11 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: I don't think it's necessary to improve concurrency just to get WAL-logging. Better concurrency is a worthy goal of its own, of course, but it's a separate concern. To some extent, I agree, but only to some extent. To make hash indexes generally usable, we really need to solve both problems. When I got rid of just some of the heavyweight locking in commit 76837c1507cb5a5a0048046233568447729e66dd, the results were pretty dramatic at higher levels of concurrency: http://www.postgresql.org/message-id/CA+Tgmoaf=nojxlyzgcbrry+pe-0vll0vfhi6tjdm3fftvws...@mail.gmail.com But there was still an awful lot of contention inside the heavyweight lock manager, and I don't think hash indexes are going to be ready for prime time until we solve that problem. This is similar to your description, as you scan both buckets if you see an interrupted split, but doesn't require the per-tuple moved-by-split flag, or waiting out scans. Also, I put the split-in-progress flag in the new bucket's primary page instead of the metapage. That allows multiple splits to be in progress at the same time. Putting the split-in-progress flag in the new bucket's primary page makes a lot of sense. I don't have any problem with dumping the rest of it for a first cut if we have a different long-term plan for how to improve concurrency, but I don't see much point in going to a lot of work to implement a system for WAL logging if we're going to end up having to afterwards throw it out and start from scratch to get rid of the heavyweight locks - and it's not obvious to me how what you have here could be extended to do that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
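The interrupted-split scan Robert and Heikki are discussing can be modeled in miniature. The classes below are toy stand-ins for hash-index pages, nothing here is the real hash AM:

```python
class Bucket:
    """Toy model of a hash bucket's primary page plus its tuples."""
    def __init__(self, tuples=(), split_in_progress=False, split_from=None):
        self.tuples = list(tuples)
        self.split_in_progress = split_in_progress  # flag on the NEW bucket's primary page
        self.split_from = split_from                # old bucket still holding unmoved tuples

def bucket_scan(buckets, bucket_no, key):
    # Normal case: a single bucket to scan.
    bucket = buckets[bucket_no]
    matches = [t for t in bucket.tuples if t[0] == key]
    # Interrupted split: tuples not yet moved may still sit in the old
    # bucket, so scan it too -- no per-tuple moved-by-split flag needed.
    if bucket.split_in_progress:
        matches += [t for t in buckets[bucket.split_from].tuples if t[0] == key]
    return matches
```

Keeping the flag on the new bucket's primary page, rather than on the metapage, is what lets several such splits be in flight at once.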
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Wed, Mar 5, 2014 at 7:42 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Mar 4, 2014 at 5:00 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Mar 4, 2014 at 3:29 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-04 12:54:02 -0500, Robert Haas wrote: On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-04 09:47:08 -0500, Robert Haas wrote: Can't that be solved by just creating the permanent relation in a new relfilenode? That's equivalent to a rewrite, yes, but we need to do that for anything but wal_level=minimal anyway. Yes, that would work. I've tended to view optimizing away the relfilenode copy as an indispensable part of this work, but that might be wrongheaded. It would certainly be a lot easier to make this happen if we didn't insist on that. I think it'd already be much better than today's situation, and it's a required codepath for wal_level logical anyway. So even if somebody wants to make this work without the full copy for minimal, it'd still be a required codepath. So I am perfectly ok with a patch just adding that. Then is this a good idea for a GSoC project? I don't know these internals very well, but I am willing to learn, and I think GSoC is a good opportunity. Are any of you willing to mentor this project? I've written the proposal for this feature, so I would like to know if someone can review it. I think this isn't a good design. Per the discussion between Andres and me, I think that what you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On 6 March 2014 19:42, Robert Haas robertmh...@gmail.com wrote: On Wed, Mar 5, 2014 at 7:42 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Mar 4, 2014 at 5:00 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Mar 4, 2014 at 3:29 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-04 12:54:02 -0500, Robert Haas wrote: On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-04 09:47:08 -0500, Robert Haas wrote: Can't that be solved by just creating the permanent relation in a new relfilenode? That's equivalent to a rewrite, yes, but we need to do that for anything but wal_level=minimal anyway. Yes, that would work. I've tended to view optimizing away the relfilenode copy as an indispensable part of this work, but that might be wrongheaded. It would certainly be a lot easier to make this happen if we didn't insist on that. I think it'd already much better than today's situation, and it's a required codepath for wal_level logical anyway. So even if somebody wants to make this work without the full copy for minimal, it'd still be a required codepath. So I am perfectly ok with a patch just adding that. Then is this a good idea for a GSoC project ? I don't know very well this internals, but I am willing to learn and I think the GSoC is a good opportunity. Any of you are willing to mentoring this project? I written the proposal to this feature, so I would like to know if someone can review. I think this isn't a good design. Per the discussion between Andres and I, I think that I think you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer. Sounds like a plan. Would there be any stretch-goals for this work, or is there not really anything else that could be done? -- Thom
Re: [HACKERS] walsender doesn't send keepalives when writes are pending
On 03/05/2014 10:57 PM, Andres Freund wrote: On 2014-03-05 18:26:13 +0200, Heikki Linnakangas wrote: The logic was the same before the patch, but I added the XXX comment above. Why do we sleep in increments of 1/10 of wal_sender_timeout? Originally, the code calculated when the next wakeup should happen, by adding wal_sender_timeout (or replication_timeout, as it was called back then) to the time of the last reply. Why don't we do that? [ archeology ] It imo makes sense to wakeup after last_reply + wal_sender_timeout/2, so a requested reply actually has time to arrive, but otherwise I agree. I think your patch makes sense. Additionally imo the timeout checking should be moved outside the if (caughtup || pq_is_send_pending()), but that's probably a separate patch. Any chance you could apply your patch soon? I've a patch pending that'll surely conflict with this and it seems better to fix it first. Ok, pushed. I left the polling-style sleep in place for now. Thanks! - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
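The scheduling change discussed above (wake at last_reply + timeout/2 rather than polling in timeout/10 increments) is simple arithmetic. A sketch; the function name is illustrative and the 60 s value mirrors wal_sender_timeout's default:

```python
WAL_SENDER_TIMEOUT = 60.0  # seconds; mirrors the GUC's default

def next_wakeup(last_reply_ts, timeout=WAL_SENDER_TIMEOUT):
    # Wake halfway through the timeout window: a keepalive requested then
    # still leaves the standby timeout/2 seconds to get a reply through,
    # instead of waking ten times per timeout interval.
    return last_reply_ts + timeout / 2.0

print(next_wakeup(100.0))  # 130.0
```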
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas robertmh...@gmail.com wrote: I think this isn't a good design. Per the discussion between Andres and I, I think that I think you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer. Do it like 'VACUUM FULL' for any wal_level? Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Thu, Mar 6, 2014 at 2:52 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas robertmh...@gmail.com wrote: I think this isn't a good design. Per the discussion between Andres and I, I think that I think you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer. Do it like 'VACUUM FULL' for any wal_level? Yep. Anything else appears to be a research problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Fwd: [HACKERS] patch: make_timestamp function
On Thu, Mar 6, 2014 at 1:26 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2014-03-05 16:22 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule escribió: Hi I hope that this patch fixes it. wtf? I tried to fix http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f1ba94bcd9717b94b36868d6905547e313f3a359 Tom did it better than me. The patch you attached was one from Heikki, not anything you wrote for yourself, and utterly unrelated to the topic of this thread. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Thu, Mar 6, 2014 at 5:04 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Mar 6, 2014 at 2:52 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas robertmh...@gmail.com wrote: I think this isn't a good design. Per the discussion between Andres and me, what I think you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer. Do it like 'VACUUM FULL' for any wal_level? Yep. Anything else appears to be a research problem. I'll change the proposal. Thanks a lot! Greetings, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: Fwd: [HACKERS] patch: make_timestamp function
2014-03-06 21:06 GMT+01:00 Robert Haas robertmh...@gmail.com: On Thu, Mar 6, 2014 at 1:26 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2014-03-05 16:22 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule escribió: Hi I hope that this patch fixes it. wtf? I tried to fix http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f1ba94bcd9717b94b36868d6905547e313f3a359 Tom did it better than me. The patch you attached was one from Heikki, not anything you wrote for yourself, and utterly unrelated to the topic of this thread. Yes, sorry - it was a git mix-up on my side (I must have used the wrong hash). I made changes similar to Tom's fix, but the attached patch was a different one from the one I wrote. Regards Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GSoC on WAL-logging hash indexes
On Thu, Mar 6, 2014 at 11:34 AM, Robert Haas robertmh...@gmail.com wrote: Putting the split-in-progress flag in the new bucket's primary page makes a lot of sense. I don't have any problem with dumping the rest of it for a first cut if we have a different long-term plan for how to improve concurrency, but I don't see much point in going to a lot of work to implement a system for WAL logging if we're going to end up having to afterwards throw it out and start from scratch to get rid of the heavyweight locks - and it's not obvious to me how what you have here could be extended to do that. +1 I don't think we have to improve concurrency at the same time as WAL logging, but we at least have to implement WAL logging in a way that doesn't foreclose future improvements to concurrency. I've been tempted to implement a new type of hash index that allows both WAL and high concurrency, simply by disallowing bucket splits. At the index creation time you use a storage parameter to specify the number of buckets, and that is that. If you mis-planned, build a new index with more buckets, possibly concurrently, and drop the too-small one. I think it would be easier to add bucket splitting to something like this than it would be to add WAL logging and concurrency to what we already have--mostly because I think that route would be more amenable to incremental programming efforts, and also because if people had an actually useful hash index type they would use it and see that it worked well (assuming of course that it does work well), and then be motivated to improve it. If this could be done as an extension I would just go (attempt to) do it. But since WAL isn't pluggable, it can't go in as an extension. Cheers, Jeff
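Jeff's no-split design boils down to a hash index whose bucket count is fixed at creation time; "resizing" means building a new, bigger index and dropping the too-small one. A minimal in-memory Python sketch of that idea (not PostgreSQL code; all names here are invented for illustration):

```python
class FixedBucketIndex:
    """Hash index with a bucket count fixed at creation time.
    Because buckets never split, an insert or lookup only ever touches
    one bucket, which is what keeps both concurrency and WAL-logging
    simple in Jeff's proposal."""

    def __init__(self, nbuckets):
        self.nbuckets = nbuckets
        self.buckets = [[] for _ in range(nbuckets)]

    def insert(self, key, tid):
        self.buckets[hash(key) % self.nbuckets].append((key, tid))

    def lookup(self, key):
        bucket = self.buckets[hash(key) % self.nbuckets]
        return [tid for k, tid in bucket if k == key]

    def rebuild(self, nbuckets):
        """The only way to 'resize': build a new index with more buckets
        and repopulate it, analogous to creating a larger index
        (possibly concurrently) and dropping the old one."""
        new = FixedBucketIndex(nbuckets)
        for bucket in self.buckets:
            for key, tid in bucket:
                new.insert(key, tid)
        return new
```

The trade-off Robert points out below is visible here: if nbuckets is mis-planned, chains grow without bound until someone pays for a full rebuild.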
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 6:54 PM, Josh Berkus j...@agliodbs.com wrote: The actual storage upgrade of hstore--hstore2 is fairly painless from the user perspective; they don't have to do anything. The problem is that the input/output strings are different, something which I didn't think to check for (and Peter did), and which will break applications relying on Hstore, since the drivers which support Hstore (like psycopg2) rely on string-parsing to convert it. I haven't regression-tested hstore2 against psycopg2 since I don't have a good test, but that would be a useful thing to do. Hello, psycopg developer here. Not following the entire thread as it's quite articulated and not of my direct interest (nor comprehension). But if you throw at me a few test cases I can make sure psycopg can parse them much before hstore2 is released. FYI I have a trigger that highlights me the -hackers messages mentioning psycopg, so just mentioning it is enough for me to take a better look. But if you want a more active collaboration just ask. Thank you, -- Daniele
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 12:58 PM, Daniele Varrazzo daniele.varra...@gmail.com wrote: Hello, psycopg developer here. Not following the entire thread as it's quite articulated and not of my direct interest (nor comprehension). But if you throw at me a few test cases I can make sure psycopg can parse them much before hstore2 is released. I don't think that'll be necessary. Any break in compatibility in the hstore format has been ruled a non-starter for having hstore support nested data structures. I believe on balance we're content to let hstore continue to be hstore. jsonb support would certainly be interesting, though. -- Peter Geoghegan
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 9:10 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Mar 6, 2014 at 12:58 PM, Daniele Varrazzo daniele.varra...@gmail.com wrote: Hello, psycopg developer here. Not following the entire thread as it's quite articulated and not of my direct interest (nor comprehension). But if you throw at me a few test cases I can make sure psycopg can parse them much before hstore2 is released. I don't think that'll be necessary. Any break in compatibility in the hstore format has been ruled a non-starter for having hstore support nested data structures. I believe on balance we're content to let hstore continue to be hstore. jsonb support would certainly be interesting, though. Cool, just let me know how you would expect a well-behaved client library to behave. -- Daniele
Re: [HACKERS] Rowcounts marked by create_foreignscan_path()
Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes: Maybe I'm missing something. But I don't think postgresGetForeignPaths() marks the parameterized path with the correct row estimate. Also, that function doesn't seem to estimate the cost of the parameterized path correctly. Please find attached a patch. [ looks at that... ] Oh, I see what you're worried about: we select a specific safe join clause to drive creation of the parameterized path, but then there might be additional join clauses that come along with that because they're also movable to the parameterized path. You're right, the current postgres_fdw code is not accounting for that. I experimented with this patch using this test case in the database created by the postgres_fdw regression test:

explain verbose SELECT * FROM ft2 a, ft2 b
  WHERE a.c2 = 47 AND b.c1 = a.c1 and a.c6 = 'fooo' and b.c7=a.c7;

What I saw in this example was that with the patch, the join clause got counted *twice* in the cost estimate, because the clause returned by generate_implied_equalities_for_column() isn't pointer-equal to the one in the ppi_clauses list (note the comment about such cases in postgresGetForeignPlan). So that's not right. However, we've got bigger problems than that: with or without the patch,

explain verbose SELECT * FROM ft2 a, ft2 b
  WHERE a.c2 = 47 AND b.c1 = a.c1 and a.c6 = 'fooo' and b.c7=upper(a.c7);

fails with

TRAP: FailedAssertion(!(is_foreign_expr(root, baserel, rinfo->clause)), File: postgres_fdw.c, Line: 759)

This happens because we select the safe joinclause b.c1=a.c1, and then the PPI machinery adds all the other movable join clauses for that outer relation, including the unsafe one involving an upper() call. postgresGetForeignPlan believes that the only joinclauses it can find in scan_clauses are the safe one(s) selected by postgresGetForeignPaths, but that's completely wrong.
I think this is probably relatively simple to fix: in postgresGetForeignPlan, we need to actually test whether a join clause is safe or not, and shove it into the appropriate list. So far as the original issue goes, I think what this consideration shows is that postgresGetForeignPaths is going about things incorrectly. I thought that considering only one join clause per path would be valid, if perhaps sometimes dumb. But the way the movable-clause machinery works, we don't get to consider only one join clause; it's all-or-nothing for a given outer relation. So what we really ought to be doing here is generating one parameterized path per valid outer relation. What we could possibly do to resolve this without much extra code is to keep using the same logic as a guide to which outer relations are interesting, but for each such relation, do get_baserel_parampathinfo() and then use the ppi_clauses list as the quals-to-apply for purposes of cost/selectivity estimation. postgresGetForeignPaths would have to check each of those clauses for safety rather than assuming anything. One possible benefit is that we could record the results in the path and not have to repeat the is_foreign_expr tests in postgresGetForeignPlan. I'm not entirely sure if that's worth the trouble though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
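Tom's fix for postgresGetForeignPlan amounts to testing every join clause for shippability and shoving it into the appropriate list, instead of assuming all of them are safe. Schematically, in a Python stand-in with a stubbed safety test (not the actual C code; the helper name and the string clauses are invented for illustration):

```python
def split_clauses(clauses, is_foreign_expr):
    """Partition qual clauses into ones safe to ship to the remote
    server and ones that must be evaluated locally, rather than
    asserting that every join clause is remote-safe."""
    remote_conds, local_conds = [], []
    for clause in clauses:
        (remote_conds if is_foreign_expr(clause) else local_conds).append(clause)
    return remote_conds, local_conds

# "b.c7 = upper(a.c7)" stands in for a clause using a non-shippable function.
safe = lambda c: "upper(" not in c
print(split_clauses(["b.c1 = a.c1", "b.c7 = upper(a.c7)"], safe))
# prints (['b.c1 = a.c1'], ['b.c7 = upper(a.c7)'])
```

With this shape, the unsafe upper() clause from the failing test case above would land in the local list instead of tripping an assertion.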
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Tue, Mar 04, 2014 at 06:50:17PM +0100, Andres Freund wrote: On 2014-03-04 11:40:10 -0500, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: I think this is all too late for 9.4, though. I agree with the feeling that a meaningful fix for pg_dump isn't going to get done for 9.4. So that leaves us with the alternatives of (1) put off the lock-strength-reduction patch for another year; (2) push it anyway and accept a reduction in pg_dump reliability. I don't care for (2). I'd like to have lock strength reduction as much as anybody, but it can't come at the price of reduction of reliability. I am sorry, but I think this is vastly overstating the scope of the pg_dump problem. CREATE INDEX *already* doesn't require an AEL, and the amount of problems that has caused in the past is surprisingly low. If such a frequently used command doesn't cause problems, why are you assuming other commands to be that problematic? And I think it's hard to argue that the proposed changes are more likely to cause problems. Let's try to go at this a bit more methodically. The commands that - afaics - change their locklevel due to latest patch (v21) are: [snip] Good analysis. The hazards arise when pg_dump uses one of the ruleutils.c deparse worker functions. As a cross-check to your study, I looked briefly at the use of those functions in pg_dump and how this patch might affect them:

-- pg_get_constraintdef()

pg_dump reads the constraint OID with its transaction snapshot, so we will never see a too-new constraint. Dropping a constraint still requires AccessExclusiveLock. Concerning VALIDATE CONSTRAINT, pg_dump reads convalidated with its transaction snapshot and uses that to decide whether to dump the CHECK constraint as part of the CREATE TABLE or as a separate ALTER TABLE ADD CONSTRAINT following the data load. However, pg_get_constraintdef() reads the latest convalidated to decide whether to emit NOT VALID.
Consequently, one can get a dump in which the dumped table data did not yet conform to the constraint, and the ALTER TABLE ADD CONSTRAINT (w/o NOT VALID) fails. (Suppose you deleted the last invalid rows just before executing the VALIDATE CONSTRAINT. I tested this by committing the DELETE + VALIDATE CONSTRAINT with pg_dump stopped at getTableAttrs().) One hacky, but maintainable and effective, solution to the VALIDATE CONSTRAINT problem is to have pg_dump tack on a NOT VALID if pg_get_constraintdef() did not do so. It's, conveniently, the last part of the definition. I would tend to choose this. We could also just decide this isn't bad enough to worry about. The consequence is that an ALTER TABLE ADD CONSTRAINT fails. Assuming no --single-transaction for the original restoral, you just add NOT VALID to the command and rerun. Like most of the potential new pg_dump problems, this can already happen today if the relevant database changes happen between taking the pg_dump transaction snapshot and locking the tables.

-- pg_get_expr() for default expressions

pg_dump reads pg_attrdef.adbin using its transaction snapshot, so it will never see a too-new default. This does allow us to read a dropped default. That's not a problem directly. However, suppose the default references a function dropped at the same time as the default. pg_dump could fail in pg_get_expr().

-- pg_get_indexdef()

As you explained elsewhere, new indexes are no problem. DROP INDEX still requires AccessExclusiveLock. Overall, no problems here.

-- pg_get_ruledef()

The patch changes lock requirements for enabling and disabling of rules, but that is all separate from the rule expression handling. No problems.

-- pg_get_triggerdef()

The patch reduces CREATE TRIGGER and DROP TRIGGER to ShareUpdateExclusiveLock. The implications for pg_dump are similar to those for pg_get_expr().

-- pg_get_viewdef()

Untamed: pg_dump does not lock views at all.
One thing not to forget is that you can always get the old mutual exclusion back by issuing LOCK TABLE just before a DDL operation. If some unlucky user regularly gets pg_dump failures due to concurrent DROP TRIGGER, he has a workaround. There's no comparable way for someone who would not experience that problem to weaken the now-hardcoded AccessExclusiveLock. Many consequences of insufficient locking are too severe for that workaround to bring comfort, but the pg_dump failure scenarios around pg_get_expr() and pg_get_triggerdef() seem mild enough. Restore-time failures are more serious, hence my recommendation to put a hack in pg_dump around VALIDATE CONSTRAINT. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
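The NOT VALID tack-on hack Noah recommends is pure string surgery on pg_get_constraintdef()'s output, since NOT VALID is conveniently the last part of the definition. A rough sketch of the transformation in Python (pg_dump itself is C; the function name here is invented):

```python
def force_not_valid(constraint_def):
    """Append NOT VALID to a constraint definition unless
    pg_get_constraintdef() already emitted it, so that the restore-time
    ALTER TABLE ADD CONSTRAINT cannot fail on data that did not yet
    conform to the constraint when it was dumped."""
    if constraint_def.rstrip().upper().endswith("NOT VALID"):
        return constraint_def
    return constraint_def.rstrip() + " NOT VALID"

print(force_not_valid("CHECK ((x > 0))"))
# prints CHECK ((x > 0)) NOT VALID
```

The operation is idempotent, which is what makes the hack maintainable: it cannot double-append if the server already printed NOT VALID.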
Re: [HACKERS] GSoC on WAL-logging hash indexes
On Thu, Mar 6, 2014 at 3:44 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Thu, Mar 6, 2014 at 11:34 AM, Robert Haas robertmh...@gmail.com wrote: Putting the split-in-progress flag in the new bucket's primary page makes a lot of sense. I don't have any problem with dumping the rest of it for a first cut if we have a different long-term plan for how to improve concurrency, but I don't see much point in going to a lot of work to implement a system for WAL logging if we're going to end up having to afterwards throw it out and start from scratch to get rid of the heavyweight locks - and it's not obvious to me how what you have here could be extended to do that. +1 I don't think we have to improve concurrency at the same time as WAL logging, but we at least have to implement WAL logging in a way that doesn't foreclose future improvements to concurrency. I've been tempted to implement a new type of hash index that allows both WAL and high concurrency, simply by disallowing bucket splits. At the index creation time you use a storage parameter to specify the number of buckets, and that is that. If you mis-planned, build a new index with more buckets, possibly concurrently, and drop the too-small one. Yeah, we could certainly do something like that. It sort of sucks, though. I mean, it's probably pretty easy to know that starting with the default 2 buckets is not going to be enough; most people will at least be smart enough to start with, say, 1024. But are you going to know whether you need 32768 or 1048576 or 33554432? A lot of people won't, and we have more than enough reasons for performance to degrade over time as it is. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 6 March 2014 22:43, Noah Misch n...@leadboat.com wrote: Good analysis. The hazards arise when pg_dump uses one of the ruleutils.c deparse worker functions. Ah, good. We're thinking along the same lines. I was already working on this analysis also. I'll complete mine and then compare notes. One thing not to forget is that you can always get the old mutual exclusion back by issuing LOCK TABLE just before a DDL operation. If some unlucky user regularly gets pg_dump failures due to concurrent DROP TRIGGER, he has a workaround. There's no comparable way for someone who would not experience that problem to weaken the now-hardcoded AccessExclusiveLock. Good point. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] GSoC on WAL-logging hash indexes
On Thu, Mar 6, 2014 at 11:14 PM, Robert Haas robertmh...@gmail.com wrote: I've been tempted to implement a new type of hash index that allows both WAL and high concurrency, simply by disallowing bucket splits. At the index creation time you use a storage parameter to specify the number of buckets, and that is that. If you mis-planned, build a new index with more buckets, possibly concurrently, and drop the too-small one. Yeah, we could certainly do something like that. It sort of sucks, though. I mean, it's probably pretty easy to know that starting with the default 2 buckets is not going to be enough; most people will at least be smart enough to start with, say, 1024. But are you going to know whether you need 32768 or 1048576 or 33554432? A lot of people won't, and we have more than enough reasons for performance to degrade over time as it is. The other thought I had was that you can do things lazily in vacuum. So when you probe you need to check multiple pages until vacuum comes along and rehashes everything. -- greg
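Greg's lazy-rehash idea implies that, until vacuum rehashes everything, a probe must check the bucket chosen by the old bucket count as well as the one chosen by the new. A toy in-memory sketch (invented names, no relation to the on-disk hash AM; it assumes only one outstanding doubling at a time):

```python
class LazyHash:
    """Hash table that doubles its bucket count as a metadata-only
    operation; un-moved keys stay where the old modulus put them, so a
    lookup probes both candidate buckets until a vacuum-style pass
    rehashes everything."""

    def __init__(self, nbuckets):
        self.old_n = self.new_n = nbuckets
        self.buckets = [[] for _ in range(nbuckets)]

    def insert(self, key, tid):
        self.buckets[hash(key) % self.new_n].append((key, tid))

    def lookup(self, key):
        # Until vacuum finishes, a key may sit in the bucket chosen by
        # the old modulus, so probe both candidate "pages".
        slots = {hash(key) % self.new_n, hash(key) % self.old_n}
        return [tid for s in slots for k, tid in self.buckets[s] if k == key]

    def grow(self):
        # Doubling is just metadata; no entries move yet.
        self.buckets.extend([] for _ in range(self.new_n))
        self.new_n *= 2

    def vacuum(self):
        # Rehash everything under the new modulus; afterwards lookups
        # are back to a single-bucket probe.
        entries = [e for b in self.buckets for e in b]
        self.buckets = [[] for _ in range(self.new_n)]
        for key, tid in entries:
            self.buckets[hash(key) % self.new_n].append((key, tid))
        self.old_n = self.new_n
```

The cost model is visible in lookup(): reads pay for up to two bucket probes between the doubling and the vacuum pass, in exchange for splits that need no immediate tuple movement.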
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 12:58 PM, Daniele Varrazzo wrote: On Thu, Mar 6, 2014 at 6:54 PM, Josh Berkus j...@agliodbs.com wrote: The actual storage upgrade of hstore--hstore2 is fairly painless from the user perspective; they don't have to do anything. The problem is that the input/output strings are different, something which I didn't think to check for (and Peter did), and which will break applications relying on Hstore, since the drivers which support Hstore (like psycopg2) rely on string-parsing to convert it. I haven't regression-tested hstore2 against psycopg2 since I don't have a good test, but that would be a useful thing to do. Hello, psycopg developer here. Not following the entire thread as it's quite articulated and not of my direct interest (nor comprehension). But if you throw at me a few test cases I can make sure psycopg can parse them much before hstore2 is released. Looks like that won't be necessary, Daniele. But thanks for speaking up! -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
I was thinking more like

typedef struct CustomPathFuncs
{
    const char *name;       /* used for debugging purposes only */
    NodeCopy_function node_copy;
    NodeTextOut_function node_textout;
    ... etc etc etc ...
} CustomPathFuncs;

typedef struct CustomPath
{
    Path path;
    const CustomPathFuncs *funcs;
    ... maybe a few more fields here, but not too darn many ...
} CustomPath;

and similarly for CustomPlan. The advantage of this way is it's very cheap for (what I expect will be) the common case where an extension has a fixed set of support functions for its custom paths and plans. It just declares a static constant CustomPathFuncs struct, and puts a pointer to that into its paths. If an extension really needs to set the support functions on a per-object basis, it can do this:

typedef struct MyCustomPath
{
    CustomPath cpath;
    CustomPathFuncs funcs;
    ... more fields ...
} MyCustomPath;

and then initialization of a MyCustomPath would include

mypath->cpath.funcs = &mypath->funcs;
mypath->funcs.node_copy = MyCustomPathCopy;
... etc etc ...

In this case we're arguably wasting one pointer worth of space in the path, but considering the number of function pointers such a path will be carrying, I don't think that's much of an objection. That is exactly the same as my expectation, and no objection here. Thanks for your clarification. So? If you did that, then you wouldn't have renumbered the Vars as INNER/OUTER. I don't believe that CUSTOM_VAR is necessary at all; if it is needed, then there would also be a need for an additional tuple slot in executor contexts, which you haven't provided. For example, the enhanced postgres_fdw fetches the result set of a remote join query, thus a tuple contains fields coming from both sides. In this case, what varno shall be suitable to put? Not sure what we'd do for the general case, but CUSTOM_VAR isn't the solution.
Consider for example a join where both tables supply columns named id --- if you put them both in one tupledesc then there's no non-kluge way to identify them. Possibly the route to a solution involves adding another plan-node callback function that ruleutils.c would use for printing Vars in custom join nodes. Or maybe we could let the Vars keep their original RTE numbers, though that would complicate life at execution time. My preference is the former, because extra complication at execution time may cause performance degradation. Once two tuples get joined in a custom node, only the extension can know which relation is the origin of a particular attribute in the unified tuple. So it seems reasonable to me that the extension has to provide a hint to resolve the Var naming: probably another callback that provides a translation table for a Var node that references the custom plan but originated from either subtree. (It would look like translated_vars in prepunion.c.) For example, assume there is a Var node with INDEX_VAR in the tlist of the custom plan. It eventually references ecxt_scantuple at execution time, and this tuple slot will hold a joined tuple originating from two relations. If its varattno=9 came from the column varno=1/varattno=3, we would like to print its original name. If we have a translation table like translated_vars, it allows translating an attribute number on the custom plan into its original one. Even if it might be an abuse of INDEX_VAR, it seems a good idea to me. Also, I don't want to redefine the meaning of INNER_VAR/OUTER_VAR, because a custom plan may have both left and right subtrees, so it makes sense to support the case where both tuple slots are available. Anyway, if we're going to punt on add_join_path_hook for the time being, this problem can probably be left to solve later. It won't arise for simple table-scan cases, nor for single-input plan nodes such as sorts. Yes, it is a problem if the number of input plans is larger than 1.
Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei kai...@ak.jp.nec.com
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Thu, Mar 6, 2014 at 5:05 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Thu, Mar 6, 2014 at 5:04 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Mar 6, 2014 at 2:52 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas robertmh...@gmail.com wrote: I think this isn't a good design. Per the discussion between Andres and me, what I think you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer. Do it like 'VACUUM FULL' for any wal_level? Yep. Anything else appears to be a research problem. I'll change the proposal. Thanks a lot! One last question: do you think it would be difficult to implement ALTER TABLE ... SET UNLOGGED too? Thinking within the scope of one GSoC, of course. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
[HACKERS] Clustering and PostgresXC meetings at pgCon 2014
Clustering Summit

We will be holding the fourth annual Cluster-Hacker meeting on Tuesday May 20th, just before pgCon 2014. Everyone with a serious interest in clustering, replication, and scale-out technologies is welcome and encouraged to attend. This meeting will take place from 2pm to 6pm, in a meeting room on the University of Ottawa campus. The agenda will be set by the participants using the Clustering Summit Wiki Page: https://wiki.postgresql.org/wiki/PgCon2014ClusterSummit RSVP on the wiki page if you plan to attend. We need a headcount for the room size and for lunch (see below).

PostgresXC Meeting

The Cluster-Hackers meeting will be preceded by the PostgresXC Developer's Meeting from 9:30am to 1pm, in the same location. This meeting is for all contributors and beta-testers of PostgresXC to discuss the current state of development and plans for its future. The agenda will be set by the participants using the PostgresXC Meeting Wiki Page: https://wiki.postgresql.org/wiki/Pgcon2014PostgresXCmeeting For those attending both meetings who RSVP to both, box lunches will be provided.

PostgresXC Pizza Demo

The evening of Tuesday, May 20th, we will follow up our meetings with a demo of PostgresXC's current capabilities over pizza. All pgCon attendees are invited to join us. Location and exact time TBD; check the pgCon schedule for updates. All of the above is sponsored by NTT Open Source. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Comment - uniqueness of relfilenode
On Thu, Mar 6, 2014 at 08:34:29AM +0100, Antonin Houska wrote: You're right. I missed the fact that Postgres (unlike another DBMS that I worked with) allows a tablespace to be shared across databases. I have updated the C comment:

  * Notice that relNode is only unique within a particular database.
---
  * Notice that relNode is only unique within a particular tablespace.

Yep. But the new text is no more correct than the old text. Did you read what I wrote upthread? Perhaps "... unique within a particular (tablespace, database) combination."? Oh, I thought people just wanted the text to be consistent with other mentions, rather than improving it. Attached is a patch which updates comments about the tablespace/database issue. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +

diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index e3002eb..282e0b9
*** a/src/backend/catalog/catalog.c
--- b/src/backend/catalog/catalog.c
*** GetNewOidWithIndex(Relation relation, Oi
*** 408,415 ****
  /*
   * GetNewRelFileNode
!  *	Generate a new relfilenode number that is unique within the given
!  *	tablespace.
   *
   * If the relfilenode will also be used as the relation's OID, pass the
   * opened pg_class catalog, and this routine will guarantee that the result
--- 408,415 ----
  /*
   * GetNewRelFileNode
!  *	Generate a new relfilenode number that is unique within the
!  *	database of the given tablespace.
   *
   * If the relfilenode will also be used as the relation's OID, pass the
   * opened pg_class catalog, and this routine will guarantee that the result
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1de3170..25f01e5
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
*** ATExecSetTableSpace(Oid tableOid, Oid ne
*** 9078,9085 ****
  	FlushRelationBuffers(rel);
  	/*
! 	 * Relfilenodes are not unique across tablespaces, so we need to allocate
! 	 * a new one in the new tablespace.
  	 */
  	newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
  									   rel->rd_rel->relpersistence);
--- 9078,9085 ----
  	FlushRelationBuffers(rel);
  	/*
! 	 * Relfilenodes are not unique in databases across tablespaces, so we
! 	 * need to allocate a new one in the new tablespace.
  	 */
  	newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
  									   rel->rd_rel->relpersistence);
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index c995c10..8616bd3
*** a/src/include/storage/relfilenode.h
--- b/src/include/storage/relfilenode.h
*** typedef enum ForkNumber
*** 55,61 ****
   * relNode identifies the specific relation. relNode corresponds to
   * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
   * to assign new physical files to relations in some situations).
!  * Notice that relNode is only unique within a particular tablespace.
   *
   * Note: spcNode must be GLOBALTABLESPACE_OID if and only if dbNode is
   * zero. We support shared relations only in the global tablespace.
--- 55,62 ----
   * relNode identifies the specific relation. relNode corresponds to
   * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
   * to assign new physical files to relations in some situations).
!  * Notice that relNode is only unique within a database in a particular
!  * tablespace.
   *
   * Note: spcNode must be GLOBALTABLESPACE_OID if and only if dbNode is
   * zero. We support shared relations only in the global tablespace.
Re: [HACKERS] pg_ctl status with nonexistent data directory
On Thu, Mar 6, 2014 at 12:17:55PM -0300, Alvaro Herrera wrote:

Bruce Momjian escribió:
Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that.

That sounds odd. Do you have a link?

Sure, the patch is here:

	http://www.postgresql.org/message-id/20130629025033.gi13...@momjian.us

and the idea of keeping what we have is stated here:

	http://www.postgresql.org/message-id/51d1e482.5090...@gmx.net

--
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Thu, Mar 6, 2014 at 5:04 PM, Robert Haas robertmh...@gmail.com wrote:

On Thu, Mar 6, 2014 at 2:52 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote:
On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas robertmh...@gmail.com wrote:
I think this isn't a good design. Per the discussion between Andres and I, what I think you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer.

Do it like 'VACUUM FULL' for any wal_level?

Yep. Anything else appears to be a research problem.

Updated proposal:

proposal

** Add to PostgreSQL the capacity to make an unlogged table logged **

Introduction

This project will allow changing an unlogged table (one that does not generate transaction log, i.e. WAL) and its dependencies into a logged table, in other words, a regular table that generates WAL. To make this happen we'll introduce a new SQL syntax:

	ALTER TABLE name SET LOGGED;

Benefits to the PostgreSQL Community

The unlogged tables feature was introduced in version 9.1 and provides better write performance than regular (logged) tables, but unlogged tables are not crash-safe. Their contents are automatically discarded (cleared) after a server crash, and their contents do not propagate to replication standbys, either. The ability to turn an unlogged table into a logged one gives us the best of both features: we can use an unlogged table to run a bulk load of thousands of rows (ETL scripts) with better performance, and then change it to a logged table to get durability for the loaded data.

Deliverables

This project has just one deliverable at the end: the implementation of the routines that transform an unlogged table into a logged one, using the same algorithm as VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode.

Project Schedule

until May 19:
* create a website for the project (wiki.postgresql.org)
* create a public repository for the project (github.com/fabriziomello)
* read what has already been discussed by the community about the project (http://wiki.postgresql.org/wiki/Todo)
* as already discussed on the pgsql-hackers mailing list, this feature will be implemented similarly to VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode
* learn about some PostgreSQL internals:
  . grammar (src/backend/parser/gram.y)
  . vacuum full (src/backend/commands/[vacuum.c | cluster.c])

May 19 - June 23:
* implementation of the first prototype:
  . change the PostgreSQL grammar to support ALTER TABLE ... SET LOGGED
  . implement and/or adapt the routines that change an unlogged table to logged (similar to VACUUM FULL)
* write documentation and test cases
* submit the first prototype to the 2014/06 commitfest (https://commitfest.postgresql.org/action/commitfest_view?id=22)

June 23 - June 27:
* review with the mentor the work done so far

June 27 - August 18:
* make adjustments based on the community feedback from the 2014/06 commitfest
* submit to the 2014/09 commitfest for final evaluation; maybe it will be committed for the 9.5 version (webpage not created yet)

August 18 - August 22:
* final review with the mentor of all work done

About the proponent

Fabrízio de Royes Mello
e-mail: fabriziome...@gmail.com
twitter: @fabriziomello
github: http://github.com/fabriziomello
linkedin: http://linkedin.com/in/fabriziomello

Currently I help people and teams get the full potential out of relational databases, especially PostgreSQL: helping teams design the structure of the database (modeling), build the physical architecture (database schema), do programming (procedural languages), SQL (usage, tuning, best practices), optimization, and orchestration of production instances. I do volunteer work for the Brazilian PostgreSQL community (www.postgresql.org.br), supporting mailing lists, organizing events (pgbr.postgresql.org.br), and some admin tasks. I also help the PostgreSQL Global Development Group (PGDG) a little with the implementation of some features and review of patches (git.postgresql.org).

/proposal

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
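A usage sketch of the syntax the proposal introduces (hypothetical only: at the time of this proposal ALTER TABLE ... SET LOGGED does not exist, and the table and file names are made-up examples):

```sql
-- Bulk-load into an unlogged table (fast: no WAL is written), then
-- make it durable.  SET LOGGED would rewrite the table into a new
-- relfilenode, VACUUM FULL style, with logged relpersistence.
CREATE UNLOGGED TABLE etl_staging (id int, payload text);
COPY etl_staging FROM '/tmp/batch.csv' (FORMAT csv);
ALTER TABLE etl_staging SET LOGGED;  -- proposed syntax
```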
Re: [HACKERS] GSoC proposal - make an unlogged table logged
Fabrízio de Royes Mello fabriziome...@gmail.com writes:

Do you think it is difficult to implement ALTER TABLE ... SET UNLOGGED too? Thinking in the scope of one GSoC, of course.

I think it's basically the same thing. You might hope to optimize it; but you have to create (rather than remove) an init fork, and there's no way to do that in exact sync with the commit. So for safety I think you have to copy the data into a new relfilenode.

			regards, tom lane
Re: [HACKERS] pg_ctl status with nonexistent data directory
On Thu, Mar 6, 2014 at 7:46 PM, Bruce Momjian br...@momjian.us wrote:

On Thu, Mar 6, 2014 at 09:54:57AM +0530, Amit Kapila wrote:
If they haven't passed us a data directory, we don't really know if the server is running or not, so the patch just returns '1'.

But for such cases, isn't status 4 more appropriate? As per the above link:
	4	program or service status is unknown

status 1:
	1	program is dead and /var/run pid file exists

Going by this definition, status 1 seems to mean someone has forcefully killed the server and the pid file still remains.

Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that. I don't see how specifying a non-existent or non-cluster directory would somehow be a case that would be special.

One reason could be that, as we are already returning a special exit code for the 'status' option of pg_ctl (exit(3)), this seems to be in line with it, since this code path also gets called during the status command. Also, in the link you sent in another mail, there is a statement indicating it is more important to return correct codes in the status case, which sounds like a good reason to me.

Link and statement:
http://www.postgresql.org/message-id/51d1e482.5090...@gmx.net
"The status case is different, because there the exit code can be passed out by the init script directly."

If we just want to handle correct exit codes for the status option, then maybe we need to refactor the code a bit to ensure that.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] pg_ctl status with nonexistent data directory
Bruce Momjian br...@momjian.us writes:

On Thu, Mar 6, 2014 at 12:17:55PM -0300, Alvaro Herrera wrote:
Bruce Momjian escribió:
Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that.

That sounds odd. Do you have a link?

Sure, the patch is here:
http://www.postgresql.org/message-id/20130629025033.gi13...@momjian.us
and the idea of keeping what we have is stated here:
http://www.postgresql.org/message-id/51d1e482.5090...@gmx.net

Perhaps I shouldn't be putting words in Peter's mouth, but my reading of his complaint was that he didn't think you'd mapped the pg_ctl failure conditions to LSB status codes very well. That's not necessarily a vote against the abstract idea of making it more LSB-compliant. But it seems like we might have to go through it case-by-case to argue out what's the right error code for each case ... and I'm not sure anybody thinks it's worth that much effort.

			regards, tom lane
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Fri, Mar 7, 2014 at 12:36 AM, Tom Lane t...@sss.pgh.pa.us wrote:

Fabrízio de Royes Mello fabriziome...@gmail.com writes:
Do you think it is difficult to implement ALTER TABLE ... SET UNLOGGED too? Thinking in the scope of one GSoC, of course.

I think it's basically the same thing. You might hope to optimize it; but you have to create (rather than remove) an init fork, and there's no way to do that in exact sync with the commit. So for safety I think you have to copy the data into a new relfilenode.

Well, the same thing that VACUUM FULL does, but changing relpersistence to RELPERSISTENCE_UNLOGGED for the new relfilenode. Is that it?

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] pg_ctl status with nonexistent data directory
On Thu, Mar 6, 2014 at 10:43:01PM -0500, Tom Lane wrote:

Bruce Momjian br...@momjian.us writes:
On Thu, Mar 6, 2014 at 12:17:55PM -0300, Alvaro Herrera wrote:
Bruce Momjian escribió:
Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that.

That sounds odd. Do you have a link?

Sure, the patch is here:
http://www.postgresql.org/message-id/20130629025033.gi13...@momjian.us
and the idea of keeping what we have is stated here:
http://www.postgresql.org/message-id/51d1e482.5090...@gmx.net

Perhaps I shouldn't be putting words in Peter's mouth, but my reading of his complaint was that he didn't think you'd mapped the pg_ctl failure conditions to LSB status codes very well. That's not necessarily a vote against the abstract idea of making it more LSB-compliant. But it seems like we might have to go through it case-by-case to argue out what's the right error code for each case ... and I'm not sure anybody thinks it's worth that much effort.

Yes, I think the question was whether the effort was worth it.

--
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com
Re: [HACKERS] pg_ctl status with nonexistent data directory
On Fri, Mar 7, 2014 at 09:07:24AM +0530, Amit Kapila wrote:

One reason could be that, as we are already returning a special exit code for the 'status' option of pg_ctl (exit(3)), this seems to be in line with it, since this code path also gets called during the status command. Also, in the link you sent in another mail, there is a statement indicating it is more important to return correct codes in the status case, which sounds like a good reason to me.

Link and statement:
http://www.postgresql.org/message-id/51d1e482.5090...@gmx.net
"The status case is different, because there the exit code can be passed out by the init script directly."

If we just want to handle correct exit codes for the status option, then maybe we need to refactor the code a bit to ensure that.

OK, done with the attached patch. Three is returned for status, but one for other cases.

--
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com

  + Everyone has their own god. +

diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
new file mode 100644
index 0dbaa6e..8c9970d
*** a/src/bin/pg_ctl/pg_ctl.c
--- b/src/bin/pg_ctl/pg_ctl.c
*** static bool allow_core_files = false;
*** 97,102 ****
--- 97,103 ----
  static time_t start_time;

  static char postopts_file[MAXPGPATH];
+ static char version_file[MAXPGPATH];
  static char pid_file[MAXPGPATH];
  static char backup_file[MAXPGPATH];
  static char recovery_file[MAXPGPATH];
*** static void pgwin32_doRunAsService(void)
*** 152,158 ****
  static int	CreateRestrictedProcess(char *cmd, PROCESS_INFORMATION *processInfo, bool as_service);
  #endif
! static pgpid_t get_pgpid(void);
  static char **readfile(const char *path);
  static void free_readfile(char **optlines);
  static int	start_postmaster(void);
--- 153,159 ----
  static int	CreateRestrictedProcess(char *cmd, PROCESS_INFORMATION *processInfo, bool as_service);
  #endif
! static pgpid_t get_pgpid(bool is_status_request);
  static char **readfile(const char *path);
  static void free_readfile(char **optlines);
  static int	start_postmaster(void);
*** print_msg(const char *msg)
*** 246,255 ****
  }

  static pgpid_t
! get_pgpid(void)
  {
  	FILE	   *pidf;
  	long		pid;

  	pidf = fopen(pid_file, "r");
  	if (pidf == NULL)
--- 247,275 ----
  }

  static pgpid_t
! get_pgpid(bool is_status_request)
  {
  	FILE	   *pidf;
  	long		pid;
+ 	struct stat statbuf;
+
+ 	if (stat(pg_data, &statbuf) != 0)
+ 	{
+ 		if (errno == ENOENT)
+ 			printf(_("%s: directory \"%s\" does not exist\n"), progname,
+ 				   pg_data);
+ 		else
+ 			printf(_("%s: cannot access directory \"%s\"\n"), progname,
+ 				   pg_data);
+ 		exit(is_status_request ? 3 : 1);
+ 	}
+
+ 	if (stat(version_file, &statbuf) != 0 && errno == ENOENT)
+ 	{
+ 		printf(_("%s: directory \"%s\" is not a database cluster directory\n"),
+ 			   progname, pg_data);
+ 		exit(is_status_request ? 3 : 1);
+ 	}

  	pidf = fopen(pid_file, "r");
  	if (pidf == NULL)
*** do_start(void)
*** 810,816 ****
  	if (ctl_command != RESTART_COMMAND)
  	{
! 		old_pid = get_pgpid();
  		if (old_pid != 0)
  			write_stderr(_("%s: another server might be running; trying to start server anyway\n"),
--- 830,836 ----
  	if (ctl_command != RESTART_COMMAND)
  	{
! 		old_pid = get_pgpid(false);
  		if (old_pid != 0)
  			write_stderr(_("%s: another server might be running; trying to start server anyway\n"),
*** do_stop(void)
*** 894,900 ****
  	pgpid_t		pid;
  	struct stat statbuf;

! 	pid = get_pgpid();

  	if (pid == 0)				/* no pid file */
  	{
--- 914,920 ----
  	pgpid_t		pid;
  	struct stat statbuf;

! 	pid = get_pgpid(false);

  	if (pid == 0)				/* no pid file */
  	{
*** do_stop(void)
*** 943,949 ****
  	for (cnt = 0; cnt < wait_seconds; cnt++)
  	{
! 		if ((pid = get_pgpid()) != 0)
  		{
  			print_msg(".");
  			pg_usleep(1000000);	/* 1 sec */
--- 963,969 ----
  	for (cnt = 0; cnt < wait_seconds; cnt++)
  	{
! 		if ((pid = get_pgpid(false)) != 0)
  		{
  			print_msg(".");
  			pg_usleep(1000000);	/* 1 sec */
*** do_restart(void)
*** 980,986 ****
  	pgpid_t		pid;
  	struct stat statbuf;

! 	pid = get_pgpid();

  	if (pid == 0)				/* no pid file */
  	{
--- 1000,1006 ----
  	pgpid_t		pid;
  	struct stat statbuf;

! 	pid = get_pgpid(false);

  	if (pid == 0)				/* no pid file */
  	{
*** do_restart(void)
*** 1033,1039 ****
  	for (cnt = 0; cnt < wait_seconds; cnt++)
  	{
! 		if ((pid = get_pgpid()) != 0)
  		{
  			print_msg(".");
  			pg_usleep(1000000);	/* 1 sec */
--- 1053,1059 ----
  	for (cnt = 0; cnt < wait_seconds; cnt++)
  	{
! 		if ((pid = get_pgpid(false)) != 0)
  		{
  			print_msg(".");
  			pg_usleep(1000000);	/* 1 sec */
*** do_reload(void)
*** 1071,1077 ****
  {
  	pgpid_t		pid;

! 	pid = get_pgpid();

  	if (pid == 0)				/* no pid file */
  	{
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 09:50:56PM +0400, Oleg Bartunov wrote:

Hi there,

Looks like consensus is done. I and Teodor are not happy with it, but what can we do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard.

OK, if we are going with an unchanged hstore in contrib and a new JSONB, there is no reason to whack around JSONB to be binary compatible with the old hstore format. What sacrifices did we need to make to have JSONB be binary compatible with hstore? Can those sacrifices be removed, and can that be done in time for 9.4?

--
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 04:33:08PM +0100, Ronan Dunklau wrote:

I'm not sure what the constraints of json that you might want to break are. Perhaps you'd like to specify.

I haven't followed the whole thread, but json is really restrictive on the supported types: a hierarchical hstore could maybe support more types (timestamp comes to mind) as its values, which is not a valid data type in the json spec.

Yes, I can see this as an idea for a new data type that allows hierarchical storage of key/value pairs where the value can be any Postgres data type. It wouldn't be called hstore, or hstore2, but something else.

--
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com
Re: [HACKERS] Securing make check (CVE-2014-0067)
On Thu, Mar 06, 2014 at 12:44:34PM -0500, Tom Lane wrote:

Noah Misch n...@leadboat.com writes:
Thanks. To avoid socket path length limitations, I lean toward placing the socket temporary directory under /tmp rather than placing it under the CWD:

I'm not thrilled with that; it's totally insecure on platforms where /tmp isn't sticky, so it doesn't seem like an appropriate solution given that this discussion is now being driven by security concerns.

http://www.postgresql.org/message-id/flat/20121129223632.ga15...@tornado.leadboat.com

I re-read that thread. While we did fix the reporting end of it, i.e. the postmaster will now give you a clear failure message if your socket path is too long, that's going to be cold comfort to anyone who has to build in an environment they don't have much control over (such as my still-hypothetical-I-hope scenario about Red Hat package updates). I'm inclined to suggest that we should put the socket under $CWD by default, but provide some way for the user to override that choice. If they want to put it in /tmp, it's on their head as to how secure that is. On most modern platforms it'd be fine.

I am skeptical about the value of protecting systems with a non-sticky /tmp, but a long $CWD isn't of great importance, either. I'm fine with your suggestion. Though the $CWD or one of its parents could be world-writable, that would typically mean an attacker could just replace the test cases directly.

--
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com
[HACKERS] atexit_callback can be a net negative
Back in commit 249a899f7, we introduced an atexit callback that forced backend process cleanup to occur if some random backend plugin (I'm looking at you, plperl and plpython) executed exit(). While that seemed like a great safety feature at the time, bug #9464
http://www.postgresql.org/message-id/flat/20140307005623.1918.22...@wrigleys.postgresql.org
shows that this can actually destabilize things rather than improve them. The issue is that code that does fork() and then does an exit() in the subprocess will run the backend's atexit callback, which is The Wrong Thing since the parent backend is (probably) still running.

In the bug thread I proposed making atexit_callback check whether getpid() still matches MyProcPid. If it doesn't, then presumably we inherited the atexit callback list, along with the value of MyProcPid, from some parent backend process whose elbow we should not joggle. Can anyone see a flaw in that?

There's an interesting connection here to the existing coding rule that child processes of the postmaster have to do on_exit_reset() immediately after being forked, to avoid running the postmaster's on_exit routines unintentionally. We've not seen any reported breakdowns in that scheme, but I wonder if we ought to handle it differently, seeing as how people keep coming up with new randomness that they want to load into the postmaster.

			regards, tom lane
Re: [HACKERS] atexit_callback can be a net negative
On Fri, Mar 7, 2014 at 2:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:

In the bug thread I proposed making atexit_callback check whether getpid() still matches MyProcPid. If it doesn't, then presumably we inherited the atexit callback list, along with the value of MyProcPid, from some parent backend process whose elbow we should not joggle. Can anyone see a flaw in that?

While my answer would be "not really" (lots of python libraries do the same to handle forks), there's an optional path: pthread_atfork.
Re: [HACKERS] Securing make check (CVE-2014-0067)
Noah Misch n...@leadboat.com writes:

On Thu, Mar 06, 2014 at 12:44:34PM -0500, Tom Lane wrote:
I'm inclined to suggest that we should put the socket under $CWD by default, but provide some way for the user to override that choice. If they want to put it in /tmp, it's on their head as to how secure that is. On most modern platforms it'd be fine.

I am skeptical about the value of protecting systems with a non-sticky /tmp, but a long $CWD isn't of great importance, either. I'm fine with your suggestion. Though the $CWD or one of its parents could be world-writable, that would typically mean an attacker could just replace the test cases directly.

If the build tree is world-writable, that is clearly Not Our Fault.

			regards, tom lane
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On 03/06/2014 02:58 AM, Tom Lane wrote:

Craig Ringer cr...@hobby.2ndquadrant.com writes:
One of the remaining issues with row security is how to pass plan invalidation information generated in the rewriter back into the planner. With row security, it's necessary to set a field in PlannerGlobal, tracking the user ID of the user the query was planned for if row security was applied. It is also necessary to add a PlanInvalItem for the user ID.

TBH I'd just add a user OID field in struct Query and not hack up a bunch of existing function APIs. It's not much worse than the existing constraintDeps field.

If you're happy with that, I certainly won't complain. It's much simpler and less intrusive. I should be able to post an update using this later today.

The PlanInvalItem could perfectly well be generated by the planner, no, if it has the user OID? But I'm not real sure why you need it. I don't see the reason for an invalidation triggered by user ID. What exactly about the *user*, and not something else, would trigger plan invalidation?

It's only that the plan depends on the user ID. There's no point keeping the plan around if the user no longer exists. You're quite right that this can be done in the planner when a dependency on the user ID is found, though. So there's no need to pass a PlanInvalItem down, which is a lot nicer.

What we do need is a notion that a plan cache entry might only be valid for a specific calling user ID; but that's a matter for cache entry lookup, not for subsequent invalidation.

Yes, that would be good, but is IMO more of a separate optimization. I'm currently using KaiGai's code to invalidate and re-plan when a user ID change is detected.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] syslog_ident mentioned as syslog_identify in the docs
Hi all,

In the documentation, particularly the doc index, syslog_ident is incorrectly mentioned as syslog_identity. The attached patch fixes that. This error has been in the docs since 8.0.

Regards,
--
Michael

*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 3743,3749 ****
  local0.*    /var/log/postgresql

      <varlistentry id="guc-syslog-ident" xreflabel="syslog_ident">
       <term><varname>syslog_ident</varname> (<type>string</type>)</term>
       <indexterm>
!       <primary><varname>syslog_identity</> configuration parameter</primary>
       </indexterm>
       <listitem>
        <para>
--- 3743,3749 ----
      <varlistentry id="guc-syslog-ident" xreflabel="syslog_ident">
       <term><varname>syslog_ident</varname> (<type>string</type>)</term>
       <indexterm>
!       <primary><varname>syslog_ident</> configuration parameter</primary>
       </indexterm>
       <listitem>
        <para>
Re: [HACKERS] gaussian distribution pgbench
Hi,

(2014/03/04 17:42), KONDO Mitsumasa wrote:
(2014/03/04 17:28), Fabien COELHO wrote:
OK. I'm not sure which idea is the best. So I'll wait for comments from the community :)

Hmmm. Maybe you can do what Tom voted for, he is the committer :-)

Yeah, but he might change his mind based on our discussion. So I'll wait until tomorrow, and if there are no comments, I will start to implement what Tom voted for.

I have created a patch with the fixed UI. If we agree on this interface, I will also start to fix the documentation. The new \setrandom interface is:

	\setrandom var min max [gaussian threshold | exponential threshold]

The attached patch implements this interface, but it has slightly ugly coding in executeStatement() and process_commands(), along these lines:

	if (argc == 4)
	{
		... /* uniform */
	}
	else if (argv[4] == "gaussian" or "exponential")
	{
		... /* gaussian or exponential */
	}
	else
	{
		... /* uniform with extra arguments */
	}

This is because a pgbench custom script allows extra comments or extra arguments in its file. For example, the following cases are all fine:

	\setrandom var min max #hoge                      -- uniform random
	\setrandom var min max #hoge1 #hoge2              -- uniform random
	\setrandom var min max gaussian threshold #hoge   -- gaussian random

And other cases are classified as follows:

	\setrandom var min max gaussian #hoge             -- uniform
	\setrandom var min max max2 gaussian threshold    -- uniform
	\setrandom var min gaussian #hoge                 -- ERROR

However, if we get the grammar wrong in a pgbench custom script, pgbench prints an error log on the user's terminal, so I think this is not really a problem. What do you think?
Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center

*** a/contrib/pgbench/pgbench.c
--- b/contrib/pgbench/pgbench.c
*** static int pthread_join(pthread_t th, void **thread_return);
*** 98,103 ****
--- 98,106 ----
  #define LOG_STEP_SECONDS	5	/* seconds between log messages */
  #define DEFAULT_NXACTS	10		/* default nxacts */

+ #define MIN_GAUSSIAN_THRESHOLD		2.0	/* minimum threshold for gauss */
+ #define MIN_EXPONENTIAL_THRESHOLD	2.0	/* minimum threshold for exp */
+
  int			nxacts = 0;			/* number of transactions per client */
  int			duration = 0;		/* duration in seconds */
***************
*** 169,174 ****
--- 172,185 ----
  bool		is_connect;			/* establish connection for each transaction */
  bool		is_latencies;		/* report per-command latencies */
  int			main_pid;			/* main process id used in log filename */

+ /* gaussian distribution tests: */
+ double		stdev_threshold;	/* standard deviation threshold */
+ bool		use_gaussian = false;
+
+ /* exponential distribution tests: */
+ double		exp_threshold;		/* threshold for exponential */
+ bool		use_exponential = false;
+
  char	   *pghost = "";
  char	   *pgport = "";
  char	   *login = NULL;
*** static char *select_only = {
*** 330,335 ****
--- 341,428 ----
  	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
  };

+ /* --exponential case */
+ static char *exponential_tpc_b = {
+ 	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
+ 	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
+ 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+ 	"\\setrandom aid 1 :naccounts exponential :exp_threshold\n"
+ 	"\\setrandom bid 1 :nbranches\n"
+ 	"\\setrandom tid 1 :ntellers\n"
+ 	"\\setrandom delta -5000 5000\n"
+ 	"BEGIN;\n"
+ 	"UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n"
+ 	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+ 	"UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;\n"
+ 	"UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;\n"
+ 	"INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n"
+ 	"END;\n"
+ };
+
+ /* --exponential with -N case */
+ static char *exponential_simple_update = {
+ 	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
+ 	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
+ 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+ 	"\\setrandom aid 1 :naccounts exponential :exp_threshold\n"
+ 	"\\setrandom bid 1 :nbranches\n"
+ 	"\\setrandom tid 1 :ntellers\n"
+ 	"\\setrandom delta -5000 5000\n"
+ 	"BEGIN;\n"
+ 	"UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n"
+ 	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+ 	"INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n"
+ 	"END;\n"
+ };
+
+ /* --exponential with -S case */
+ static char *exponential_select_only = {
+ 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+ 	"\\setrandom aid 1 :naccounts exponential :exp_threshold\n"
+ 	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+ };
+
+ /* --gaussian case */
+ static char *gaussian_tpc_b = {
+ 	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
+ 	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
+ 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+ 	"\\setrandom aid
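Under the proposed interface, a custom script exercising both new distributions might look like this (a sketch of the proposed syntax only — the feature is not in released pgbench, and the threshold values here are arbitrary examples):

```
-- hypothetical pgbench custom script using the proposed \setrandom forms
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts gaussian 5.0       -- values cluster around the middle of the range
\setrandom bid 1 :naccounts exponential 2.0    -- values skew toward the low end
\setrandom tid 1 10                            -- plain uniform, unchanged syntax
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
```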
Re: [HACKERS] gaussian distribution pgbench
(2014/03/07 16:02), KONDO Mitsumasa wrote:
And other cases are classified as follows:
	\setrandom var min max gaussian #hoge    -- uniform

Oh, that's wrong... It should be:

	\setrandom var min max gaussian #hoge    -- ERROR

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center