[HACKERS] Review of Synchronous Replication patches

2010-07-24 Thread Yeb Havinga

Hello list,

My apologies if this email arrives more than once; I've been 
experiencing trouble posting to the -hackers list and am resending this 
review as a new thread instead of a reply to the existing thread at 
http://archives.postgresql.org/pgsql-hackers/2010-07/msg01072.php, which 
also has relevant links to both patches and the discussion.


Here follows a severely time-boxed review of both synchronous replication
patches. Expect this review to be incomplete, though I believe the
general remarks are valid. Off-list I've received word from Zoltan
that work on a new patch is planned. It would be great if ideas from
both patches could be merged into one.

regards,
Yeb Havinga


Patch A: Zoltán Böszörményi
Patch B: Fujii Masao

* Is the patch in context diff format?
A: Yes
B: Yes

* Does it apply cleanly to the current CVS HEAD?
A: No
B: Yes

* Does it include reasonable tests, necessary doc patches, etc?
A: Doc patches, and a contrib module for debugging purposes.
B: Doc patches.

Neither patch's documentation is final; see e.g.
* 25.2 - the section starting with "It should be noted that the log
shipping is asynchronous, i.e. ..." should be updated.
* 25.2.5 - ditto for "Streaming replication is asynchronous, so there..."

Testing any replication setup is not possible with the current
regression tool suite, and adding that is beyond the scope of any
synchronous replication patch. Perhaps the work of Markus Wanner on
dtester (which adds a 'make dcheck' target) can be used for this purpose.

Both patches add synchronous replication to the current single-master
streaming replication feature in 9.0, which means that there is now a
mechanism where a commit on the master does not finish until 'some or
more of the replicas are in sync'.

* Does the patch actually implement that?
A: Yes
B: No; if no standbys are connected, commits do not wait. (If I
understand the code in WaitXLogSend correctly.)

* Do we want that?
For database users who never want to lose a single committed record,
synchronous replication is a relatively easy way to achieve a high
level of 'security' (where secure means: less chance of losing data).
The ease is relative to current solutions (the whole range of
mirrored disks, controllers, backups, log shipping, etc.).

* Do we already have it?
No

* Does it follow SQL spec, or the community-agreed behavior?
A: Unknown, though the choices of GUC parameters suggest the patch has
been made to support actual use cases.
B: The design of patch B has been discussed on the mailing list as well
as on the wiki (for links I refer to my previous mail).

* Are there dangers?
Besides the usual errors causing transactions to fail, in my opinion the
single largest danger would be that the master thinks a standby is in
sync when in fact it isn't.

* Have all the bases been covered?
Patch A seems to cover two-phase commit, whereas patch B does not. (I'm
time-constrained and no longer have a replicated cluster with
patch B, so I cannot test.)

# Does the feature work as advertised?
Patch A: yes
Patch B: yes

I ran a few tests with failures and hit some recoverable problems; I'm
unsure whether they are related to the sync rep patches or to streaming
replication in general. Work in the direction of a replication
regression test would be useful.

# Are there any assertion failures or crashes?
No

A note: reading the source code of both patches makes it clear that these
are patches from experienced PostgreSQL programmers. The source code
reads just like 'normal' high-quality PostgreSQL source.

Differences:
Patch A synchronizes by sending back the XIDs that have been received
and written to disk. When the master receives the XIDs, the respective
backends having those XIDs active are unlocked and signalled to
continue. This requires some locking of the procarray. The libpq protocol
was adapted so a client (the walreceiver) can report back the XIDs.
Patch B synchronizes by waiting to send WAL data until the sync
replicas have sent back LSN pointers into the WAL files they have
synced, in one of three ways. The libpq protocol was adapted so the
walreceiver can report back WAL file positions.
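The bookkeeping difference between the two acknowledgement schemes can be sketched as follows (a hypothetical, much-simplified Python illustration of the description above, not code from either patch):

```python
# Patch A style (assumption: simplified model, not actual patch code):
# the standby reports back flushed XIDs; the master must look up each
# reported XID and wake the matching backend (procarray-style bookkeeping).
def release_by_xid(waiting_backends, acked_xids):
    """waiting_backends: dict xid -> backend name; returns backends to wake."""
    released = [b for xid, b in waiting_backends.items() if xid in acked_xids]
    for xid in acked_xids:
        waiting_backends.pop(xid, None)
    return released

# Patch B style: the standby reports a single WAL position (LSN); every
# backend whose commit record lies at or before that LSN may proceed --
# one watermark comparison, no per-transaction lookup.
def release_by_lsn(waiting_backends, acked_lsn):
    """waiting_backends: list of (commit_lsn, backend name) pairs."""
    return [b for lsn, b in waiting_backends if lsn <= acked_lsn]

print(release_by_xid({101: "b1", 105: "b2"}, {101}))      # ['b1']
print(release_by_lsn([(0x10, "b1"), (0x40, "b2")], 0x20))  # ['b1']
```

The point of the contrast: the XID scheme tracks individual transactions, while the LSN scheme reduces the whole question to a single watermark comparison.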

Perhaps the weakness of both patches is that they are not one. In my
opinion, both patches have their strengths. It would be great if these
strengths could be unified in a single patch.

patch A strengths:
* the design of the GUC parameters - in particular the meaning of
min_sync_replication_clients. As I understand it, it is not possible
with patch B to define a minimum number of synchronous standby servers a
transaction must be replicated to before the commit succeeds. Perhaps
the name could be changed (quorum_min_sync_standbys?), but in my opinion
the definitive sync replication patch needs a parameter with this exact
meaning. The boolean 'synchronous_slave' GUC is also a good one in my
opinion.
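The wait condition such a parameter implies could be sketched like this (hypothetical Python; quorum_min_sync_standbys is only the name under discussion, not an implemented GUC):

```python
def commit_may_proceed(commit_lsn, standby_acked_lsns, quorum_min_sync_standbys):
    """True once at least quorum_min_sync_standbys standbys have acknowledged
    (received/flushed/replayed, depending on mode) WAL up to commit_lsn.
    Hypothetical sketch of the proposed semantics, not patch code."""
    in_sync = sum(1 for lsn in standby_acked_lsns if lsn >= commit_lsn)
    return in_sync >= quorum_min_sync_standbys

# Two of three standbys have caught up to the commit record at LSN 0x500:
print(commit_may_proceed(0x500, [0x7FF, 0x300, 0x600], 2))  # True
print(commit_may_proceed(0x500, [0x7FF, 0x300, 0x600], 3))  # False
```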

patch B strengths:
* having the walsender synced by waiting for acked LSNs, instead of
signalling the respective backends with a specific XID active, seems like
a simpler and therefore better solution.
* the three different ways of choosing when to sync (recv, flush and
replay), and also that this is configurable per standby.

In patch B there's probably some work to do in WaitXLogSend(); after a
long day I'm having trouble understanding its meaning, but somehow I
believe that a more intuitive set of GUC parameters can be found,
accompanied by clear code in this function. The way patch A synchronizes
and counts synchronous standbys for a commit is clearer.

Re: [HACKERS] Review of Synchronous Replication patches

2010-07-24 Thread zb
Hi,

 ... Off the list I've received word from Zoltan
 that work on a new patch is planned. It would be great if ideas from
 both patches could be merged into one.

...

 * Does it follow SQL spec, or the community-agreed behavior?
 A: Unknown, though the choices in guc parameters suggest the patch's
 been made to support actual use cases.
 B: Design of patch B has been discussed on the mailing list as well as
 the wiki (for links I refer to my previous mail)

 * Have all the bases been covered?
 Patch A seems to cover two-phase commit where patch B does not. (I'm
 time constrained and currently do not have a replicated cluster with
 patch B anymore, so I cannot test).

...

 Differences:
 Patch A synchronizes by sending back the XIDs that have been received
 and written to disk. When the master receives the XIDs, the respective
 backends having those XIDs active are unlocked and signalled to
 continue. This requires some locking of the procarray. The libpq protocol
 was adapted so a client (the walreceiver) can report back the XIDs.
 Patch B synchronizes by waiting to send WAL data until the sync
 replicas have sent back LSN pointers into the WAL files they have
 synced, in one of three ways. The libpq protocol was adapted so the
 walreceiver can report back WAL file positions.

 Perhaps the weakness of both patches is that they are not one. In my
 opinion, both patches have their strengths. It would be great if these
 strengths could be unified in a single patch.

 patch A strengths:
 * design of the guc parameters - meaning of
 min_sync_replication_clients. As I understand it, it is not possible
 with patch B to define a minimum number of synchronous standby servers a
 transaction must be replicated to, before the commit succeeds. Perhaps
 the name could be changed (quorum_min_sync_standbys?), but in my opinion
 the definitive sync replication patch needs a parameter with this number
 and exact meaning. The 'synchronous_slave' guc in boolean is also a good
 one in my opinion.

 patch B strengths:
 * having the walsender synced by waiting for acked LSN, instead of
 signalling respective backends with a specific XID active, seems like a
 simpler and therefore better solution.
 * the three different ways when to sync (recv,flush and replay), and
 also that this is configurable per standby.

 In patch B there's probably some work to do in WaitXLogSend(); after a
 long day I'm having trouble understanding its meaning, but somehow I
 believe that a more intuitive set of guc parameters can be found,
 accompanied by clear code in this function. The way patch A synchronizes
 and counts synchronous standbys for a commit is clearer.

This week I am on vacation, but we discussed your review a little with
Hans-Jürgen Schönig, and given the opinion and the consensus that sending
back LSNs is the better solution, I will withdraw my patch from the
commitfest page.

Instead, I will post a patch that unifies my configuration choices with
Fujii's patch. Do you have suggestions for better-worded GUCs?
'slave' seems to have been phased out in favor of 'standby' for political
correctness, so synchronous_standby instead of synchronous_slave. You
mentioned min_sync_replication_clients - quorum_min_sync_standbys. What else?

You also noticed that my patch addresses 2PC; maybe I will have to add
this part to Fujii's patch, too. Note: I haven't yet read his patch;
maybe working with LSNs instead of XIDs makes this work automatically,
I don't know. We should definitely test.

Best regards,
Zoltán Böszörményi



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE

2010-07-24 Thread Marko Tiikkaja

On 7/24/2010 1:43 AM, Robert Haas wrote:

On Fri, Jul 23, 2010 at 6:20 PM, Alvaro Herrera
alvhe...@commandprompt.com  wrote:

It seems like it's EXPLAIN ANALYZE that needs fixing.


I would suggest that if we're going to change this, we back-patch it
to 9.0 before beta4.


I did some digging and found the commit that changed the behaviour:
http://archives.postgresql.org/pgsql-committers/2004-09/msg00155.php and 
then later Tom fixed a bug in it: 
http://archives.postgresql.org/pgsql-committers/2005-10/msg00316.php


According to the latter commit, not updating the snapshot could be 
preferable for EXPLAIN ANALYZE, but I don't see why this is.  Maybe we 
should wait until we hear from Tom?



Regards,
Marko Tiikkaja



Re: [HACKERS] antisocial things you can do in git (but not CVS)

2010-07-24 Thread Guillaume Lelarge
On 21/07/2010 09:53, Dave Page wrote:
 On Tue, Jul 20, 2010 at 8:12 PM, Peter Eisentraut pete...@gmx.net wrote:
 My preference would be to stick to a style where we identify the
 committer using the author tag and note the patch author, reviewers,
 whether the committer made changes, etc. in the commit message.  A
 single author field doesn't feel like enough for our workflow, and
 having a mix of authors and committers in the author field seems like
 a mess.

 Well, I had looked forward to actually putting the real author into the
 author field.
 
 I hadn't realised that was possible until Guillaume did so on his
 first commit to the new pgAdmin GIT repo. It seems to work nicely:
 
 http://git.postgresql.org/gitweb?p=pgadmin3.git;a=commit;h=08e2826d90129bd4e4b3b7462bab682dd6a703e4
 

It's one of the nice things with git. So, I'm eager to use it with the
pgAdmin repo.


-- 
Guillaume
 http://www.postgresql.fr
 http://dalibo.com



Re: [HACKERS] Review: Patch for phypot - Pygmy Hippotause

2010-07-24 Thread Andrew Geery
It looks good to me: (0) the new patch applies cleanly to CVS HEAD; (1)
the formatting of the code was changed; (2) the definition of the HYPOT
macro was changed to use phypot rather than being removed; (3) the
phypot function was declared extern; (4) the comments on the
phypot function were changed to remove the reference to the SUS
behavior.

I changed the status to Ready for Committer.

Thanks
Andrew

On Fri, Jul 23, 2010 at 4:01 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:

 I think the patch is good in principle

 Since everyone seems to agree this is a good patch which needed
 minor tweaks, and we haven't heard from the author, I just went
 ahead and made the changes.

 Andrew, could you take another look and see if you agree I've
 covered it all before it gets marked ready for a committer?

 -Kevin




Re: [HACKERS] Review of Synchronous Replication patches

2010-07-24 Thread Yeb Havinga

Hello Zoltán,

Thanks for your reply!

Instead, I will post a patch that unifies my configuration choices with
Fujii's patch.
Please know that Fujii's patch is also a work in progress. I didn't 
mention in my review the previously discussed items, most importantly 
changing the polling loops (see e.g. 
http://archives.postgresql.org/pgsql-hackers/2010-07/msg00757.php).

Do you have suggestions for better worded GUCs?
slave seems to be phased out by standby for political correctness, so
synchronous_standby instead of synchronous_slave. You mentioned
min_sync_replication_clients - quorum_min_sync_standbys. What else?
  
The 'quorum_min_sync_standbys' name came from a reply to the design of 
Fujii's patch at 
http://archives.postgresql.org/pgsql-hackers/2010-07/msg01167.php. 
However, after thinking about it a bit more, I still fail to see the use 
case for a maximum quorum number. Having a 'max quorum' also seems to 
contradict the common meaning of 'quorum', which in itself is a minimum: 
a least required number, a lower limit.


Having 'sync' in the name is also useful, IMHO, so people know it's not 
about async servers. So the name would become quorum_sync_standbys 
or synchronous_standby_quorum.


Though I think I wouldn't use 'strict_sync_replication' (explained in 
http://archives.postgresql.org/pgsql-hackers/2010-04/msg01516.php), I can 
imagine others wanting this feature, to avoid having the master halt on 
sync standby failure. If that is what somebody never wants, then the way 
this parameter solves it is elegant: only wait if standbys are connected.


In recovery.conf, a boolean to distinguish between sync and async servers 
(as in your patch) is IMHO better than mixing 'sync or async' into the 
replication modes. Together with the replication modes, this could then 
become

synchronous_standby (boolean)
synchronous_mode (recv, fsync, replay)
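Spelled out as a recovery.conf fragment, the proposal would look something like this (these parameter names are only the hypothetical proposal being discussed here, not an implemented syntax):

```
# on the standby
standby_mode = 'on'
primary_conninfo = 'host=master port=5432'
synchronous_standby = 'true'   # this standby acknowledges synchronously
synchronous_mode = 'fsync'     # acknowledge after WAL recv | fsync | replay
```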


You also noticed that my patch addressed 2PC, maybe I will have to add
this part to Fujii's patch, too. Note: I haven't yet read his patch,
maybe working with LSNs instead of XIDs make this work automatically,
I don't know.
Yes, and I think we definitely need automated replicated cluster tests. 
A large part of reviewing went into virtual machine and cluster setup. 
I'm not sure whether a full test suite that includes node failures 
could or should be included in the core regression tests, but anything 
that automates setup, a few transactions, and failures would benefit 
everyone working on and testing replication.


regards,
Yeb Havinga




Re: [HACKERS] Functional dependencies and GROUP BY

2010-07-24 Thread Peter Eisentraut
On fre, 2010-07-23 at 11:00 -0600, Alex Hunsaker wrote:
 I just read that patch is getting pushed till at least the next commit
 fest: http://archives.postgresql.org/pgsql-hackers/2010-07/msg01219.php
 
 Should we push this patch back too?  Alternatively we could make it
 work with just primary keys until the other patch gets in.  I think
 that makes sense; find it attached.  Thoughts?

I was thinking the same thing.

 Note I axed the index not-null attribute checking; I have not thought
 too deeply about it, other than: if it's a primary key it can't have
 null attributes Right?  I may have missed something subtle hence
 the heads up.

Another open question I thought of was whether we should put the
dependency record on the pg_index row, or the pg_constraint row, or
perhaps the pg_class row.  Right now, it is using pg_index, because that
was easiest to code up, but I suspect that once we have not-null
constraints in pg_constraint, it will be more consistent to make all
dependencies go against pg_constraint rather than a mix of several
catalogs.




Re: [HACKERS] Review of Synchronous Replication patches

2010-07-24 Thread zb
 Hello Zoltán,

 Thanks for your reply!
 Instead, I will post a patch that unifies my configuration choices with
 Fujii's patch.
 Please know that Fujii's patch is also a work in progress.

Yes, I know that. But working from Fujii's last public patch
or from his git tree makes my patch easier to merge.

 I didn't
 mention in my review the previously discussed items, most important the
 changing the polling loops (see e.g.
 http://archives.postgresql.org/pgsql-hackers/2010-07/msg00757.php).

 Do you have suggestions for better worded GUCs?
 slave seems to be phased out by standby for political correctness,
 so
 synchronous_standby instead of synchronous_slave. You mentioned
 min_sync_replication_clients - quorum_min_sync_standbys. What else?

 The 'quorum_min_sync_standbys' came from a reply to the design of
 Fujii's patch on
 http://archives.postgresql.org/pgsql-hackers/2010-07/msg01167.php.
 However, after thinking about it a bit more, I still fail to see the use
 case for a maximum quorum number. Having a 'max quorum' also seems to
 contradict the common meaning of 'quorum', which in itself is a minimum:
 a least required number, a lower limit.

 Having also sync in the name is useful, imho, so people know it's not
 about async servers. So then the name would become quorum_sync_standbys
 or synchronous_standby_quorum.

Ok.

 Though I think I wouldn't use 'strict_sync_replication' (explained in
 http://archives.postgresql.org/pgsql-hackers/2010-04/msg01516.php), I can
 imagine others wanting this feature, to avoid having the master halt on
 sync standby failure. If that is what somebody never wants, then the way
 this parameter solves it is elegant: only wait if they are connected.

Thanks, I was thinking about possible use cases and different preferences
of people.

 In recovery.conf, a boolean to discern between sync and async servers
 (like in your patch) is IMHO better than mixing 'sync or async' with the
 replication modes. Together with the replication modes, this could then
 become
 synchronous_standby (boolean)
 synchronous_mode (recv,fsync,replay)

Ok.

 You also noticed that my patch addressed 2PC, maybe I will have to add
 this part to Fujii's patch, too. Note: I haven't yet read his patch,
 maybe working with LSNs instead of XIDs make this work automatically,
 I don't know.
 Yes, and I think we definitely need automated replicated cluster tests.
 A large part of reviewing went into virtual machine setup and cluster
 setup. I'm not sure if a full test suite that includes node failures
 could or should be included in the core regression test, but anything
 that automates setup, a few transactions and failures would benefit
 everyone working and testing on replication.

 regards,
 Yeb Havinga





[HACKERS] TwoPO: experimental join order algorithm

2010-07-24 Thread Adriano Lange
Hi,

I'd like to release the latest version of my experimental join order
algorithm (TwoPO - Two-Phase Optimization [1]):

http://git.c3sl.ufpr.br/gitweb?p=lbd/ljqo.git;a=summary

This algorithm is not production-ready, but an experimental set of
ideas which need to be refined and evaluated. As join order
optimization is a hard problem, evaluating a search strategy is
also a hard task. Therefore, I think the most important TODO item
related to replacing the GEQO algorithm is to define a set of
evaluation criteria considered relevant.

TwoPO is encapsulated in a plug-in called LJQO (Large Join Query
Optimization [2]). This plug-in has two main GUC variables:

ljqo_threshold = N (like geqo_threshold)
ljqo_algorithm = {twopo|geqo}

As its name implies, TwoPO internally uses two search strategies that
constitute its two phases of optimization:

 * Iterative Improvement (II) – Biased Sampling + Local Search
 * Simulated Annealing (SA)

This algorithm also works in two search spaces:

 * deep-trees (subset of bushy-trees)
- list of baserels
- initial states: biased sampling using
  Prim algorithm over join graph (new, very efficient)
- moves: swap, 3cycle [2]
 * bushy-trees
- binary join tree representation
- initial states: biased sampling using
  Kruskal's algorithm over join graph [3,4].
- moves: associative
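A toy Python sketch of the two-phase idea (my own illustration with an invented cost model and parameters mirroring the GUC names below — this is not the actual TwoPO code) may help make the structure concrete:

```python
import math
import random

random.seed(0)

# Invented toy cost model (assumption: stands in for the planner's real
# cost estimates): joining large relations early inflates intermediates.
SIZES = {"a": 1000, "b": 10, "c": 500, "d": 5}

def cost(order):
    rows, total = SIZES[order[0]], 0.0
    for rel in order[1:]:
        rows = rows * SIZES[rel] * 0.01  # crude 1% join selectivity
        total += rows                    # pay for every intermediate result
    return total

def swap_neighbor(order):
    # The 'swap' move: exchange two relations in the join order.
    i, j = random.sample(range(len(order)), 2)
    n = list(order)
    n[i], n[j] = n[j], n[i]
    return n

def iterative_improvement(rels, starts=10):
    # Phase 1 (II): several random initial states, greedy descent on each.
    best = None
    for _ in range(starts):
        state = random.sample(rels, len(rels))
        improved = True
        while improved:
            improved = False
            for _ in range(20):  # sample neighbors, accept improvements
                cand = swap_neighbor(state)
                if cost(cand) < cost(state):
                    state, improved = cand, True
        if best is None or cost(state) < cost(best):
            best = state
    return best

def simulated_annealing(state, temp=0.1, reduction=0.95, equilibrium=16,
                        rounds=50):
    # Phase 2 (SA): occasionally accept worse states to escape local minima.
    best = state
    for _ in range(rounds):
        for _ in range(equilibrium):
            cand = swap_neighbor(state)
            delta = cost(cand) - cost(state)
            if delta < 0 or random.random() < math.exp(-delta / temp):
                state = cand
                if cost(state) < cost(best):
                    best = state
        temp *= reduction
    return best

plan = simulated_annealing(iterative_improvement(list(SIZES)))
print(plan, cost(plan))
```

With this cost model the search quickly settles on orders that join the small relations first, which is the behaviour both phases are designed to find.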

You can modify the functionality of TwoPO through the following parameters:

twopo_bushy_space = {true|false}
  - set it to false if you want only deep-trees
default=true
twopo_heuristic_states = {true|false}
  - enables heuristic to initial states
default=true
twopo_ii_stop = Int
  - number of initial states
default=10
twopo_ii_improve_states = {true|false}
  - find local-minimum of each initial state
default=true
twopo_sa_phase = {true|false}
  - enables Simulated Annealing (SA) phase
default=true
twopo_sa_initial_temperature = Float
  - initial temperature for SA phase
default=0.1
twopo_sa_temperature_reduction = Float
  - temperature reduction
default=0.95
twopo_sa_equilibrium = Int
  - number of states generated for each temperature
(Int * State Size)
default=16


References:

[1] Yannis E. Ioannidis and Younkyung Kang. Randomized algorithms for
optimizing large join queries. SIGMOD Rec., 19(2):312-321, 1990.

[2] Arun Swami and Anoop Gupta. Optimization of large join queries. SIGMOD
'88: Proceedings of the 1988 ACM SIGMOD International Conference on
Management of Data, pages 8-17, New York, NY, USA, 1988. ACM.

[3] P. B. Guttoski, M. S. Sunye, and F. Silva. Kruskal's algorithm for
query tree optimization. IDEAS '07: Proceedings of the 11th
International Database Engineering and Applications Symposium, pages
296-302, Washington, DC, USA, 2007. IEEE Computer Society.

[4] Florian Waas and Arjan Pellenkoft. Join order selection - good enough
is easy. BNCOD, pages 51-67, 2000.


Regards,
Adriano Lange




Re: [HACKERS] antisocial things you can do in git (but not CVS)

2010-07-24 Thread Ron Mayer
Robert Haas wrote:
 
 If git had a place to store all the information we care about, that
 would be fine...
 
 There's no reviewer header, and there's no concept that a patch
 might have come from the author (or perhaps multiple authors), but
 then have been adjusted by one or more reviewers and then frobnicated
 some more by the committer
 ...
 I don't think that non-linear history is an advantage in any
 situation.

ISTM we could have the best of both those worlds - linear history
and author/reviewer/committer information.

Instead of squashing every patch into a single commit, what if it got
squashed into perhaps 3 separate commits -- one as submitted, one
as reviewed, and one as rewritten by the committer.  History stays
linear, and you keep the most relevant parts of the history
while dropping all the development brainstorming commits.

And ISTM the patch reviewer could be responsible for this squashing
so it's not much more work for the committer.

For example, instead of

commit 351c0b92eca40c1a36934cf83fe75db9dc458cde
Author: Robert Haas robertmh...@gmail.com
Date:   Fri Jul 23 00:43:00 2010 +
Avoid deep recursion when assigning XIDs to multiple levels of subxacts.
Andres Freund, with cleanup and adjustment for older branches by me.

we'd see

Author: Andres Freund
Date:   Fri Jul 23 00:43:00 2010 +
Avoid deep recursion when assigning XIDs to multiple levels of subxacts.
Patch as originally submitted to the commitfest

Author: [Whomever the reviewer was]
Date:   Fri Jul 23 00:43:00 2010 +
Avoid deep recursion when assigning XIDs to multiple levels of subxacts.
Patch marked ready for committer by the reviewer.

Author: Robert Haas robertmh...@gmail.com
Date:   Fri Jul 23 00:43:00 2010 +
Avoid deep recursion when assigning XIDs to multiple levels of subxacts.
Patch as rewritten by committer.

For a complex enough patch with many authors, the reviewer could choose to
keep an extra author commit or two to credit the extra authors.





Re: [HACKERS] bg worker: overview

2010-07-24 Thread Markus Wanner

Hi,

On 07/23/2010 09:45 PM, Dimitri Fontaine wrote:

Yeah, I guess user daemons would have to be workers, not plugins you
want to load into the coordinator.


Okay.


On the other side, the background workers have a connection to exactly
one database. They are supposed to do work on that database.


Is that because of the way backends are started, and to avoid having to
fork new ones too often?


For one, yes, I want to avoid having to start new ones too often. I did 
look into letting these background workers switch the database connection, 
but that turned out not to be worth the effort.
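A toy model of that design decision (hypothetical Python, not the actual bgworker code): workers stay pinned to one database, and the coordinator reuses an idle worker for that database rather than forking a new one.

```python
from collections import deque

class WorkerPool:
    """Toy model (assumption: an illustration, not the bgworker patch) of a
    coordinator handing jobs to background workers pinned to one database."""
    def __init__(self):
        self.idle = {}      # dbname -> deque of idle worker ids
        self.next_id = 0

    def dispatch(self, dbname, job):
        # Reuse an idle worker already connected to dbname; otherwise start
        # a new one -- workers never switch databases, per the design above.
        workers = self.idle.setdefault(dbname, deque())
        if workers:
            wid = workers.popleft()
        else:
            wid = self.next_id
            self.next_id += 1
        return (wid, dbname, job)

    def done(self, wid, dbname):
        # Worker finished its job and returns to the idle pool for its db.
        self.idle[dbname].append(wid)

pool = WorkerPool()
print(pool.dispatch("sales", "vacuum"))   # (0, 'sales', 'vacuum')
pool.done(0, "sales")
print(pool.dispatch("sales", "analyze"))  # (0, 'sales', 'analyze') -- reused
print(pool.dispatch("hr", "vacuum"))      # (1, 'hr', 'vacuum') -- new worker
```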


Would you prefer a background worker that's not connected to a database, 
or why are you asking?



The background workers can easily load external libraries - just as a
normal backend can with LOAD. That would also provide better
encapsulation (i.e. an error would only tear down that backend, not the
coordinator). You'd certainly have to communicate between the
coordinator and the background worker. I'm not sure how well that fits
your use case.


Pretty well I think.


Go ahead, re-use the background workers. That's what I've published them 
for ;-)



Yeah. The connection pool is better outside of core. Let's think about PGQ
and an internal task scheduler first, if we consider any generalisation.


To be honest, I still don't quite grok the concept behind PGQ. So I 
cannot really comment on this.


Regards

Markus Wanner



Re: [HACKERS] bg worker: overview

2010-07-24 Thread Dimitri Fontaine
Markus Wanner mar...@bluegap.ch writes:
 For one, yes, I want to avoid having to start ones too often. I did look
 into letting these background workers switch the database connection, but
 that turned out not to be worth the effort.

 Would you prefer a background worker that's not connected to a database, or
 why are you asking?

Trying to figure out how it would fit the PGQ and pgagent needs. But
maybe user-defined daemons should be sub-coordinators (I used to think
about them as supervisors), able to talk to the coordinator to get a
backend connected to some given database and to distribute work to it.

You're using iMessage as the data exchange; how are you doing the work
distribution? What do you use to tell the backend what processing
you're interested in?

 Go ahead, re-use the background workers. That's what I've published
 them for

Hehe :) The aim of this thread would be to get your input as far as
designing an API goes, now that we're about on track as to what the
aim is.

 To be honest, I still don't quite grok the concept behind PGQ. So I cannot
 really comment on this.

In very short, the idea is a clock that ticks and associates
current_txid() with now(), so that you're able to say 'give me 3 seconds
worth of transaction activity from this queue'. It then provides
facilities to organise a queue into batches at the consumer's request;
for more details, see:

  http://github.com/markokr/skytools-dev/blob/master/sql/ticker/pgqd.c
  http://github.com/markokr/skytools-dev/blob/master/sql/ticker/ticker.c

But the important thing, as far as making it a child of the coordinator
goes, would be, I guess, that it's some C code running as a daemon and
executing SQL queries from time to time. The SQL queries call C
user-defined functions provided by the PGQ backend module.

Regards,
-- 
dim



Re: [HACKERS] antisocial things you can do in git (but not CVS)

2010-07-24 Thread Peter Eisentraut
On lör, 2010-07-24 at 07:02 -0700, Ron Mayer wrote:
 Instead of squashing every patch into a single commit, what if it got
 squashed into a perhaps 3 separate commits -- one as submitted, one
 as reviewed, and one as re-written by the committer.  History stays
 linear; and you keep the most relevant parts of the history,
 while dropping all the development brainstorming commits.

But then there would be commits in the main repository that were known
to be not good enough.




Re: [HACKERS] antisocial things you can do in git (but not CVS)

2010-07-24 Thread Andrew Dunstan



Peter Eisentraut wrote:

On lör, 2010-07-24 at 07:02 -0700, Ron Mayer wrote:
  

Instead of squashing every patch into a single commit, what if it got
squashed into a perhaps 3 separate commits -- one as submitted, one
as reviewed, and one as re-written by the committer.  History stays
linear; and you keep the most relevant parts of the history,
while dropping all the development brainstorming commits.



But then there would be commits in the main repository that were known
not good enough.

  


Yeah. Also, please bear in mind that our explicit aim here is to make 
this change with minimal disruption to existing workflows. So to all 
those people who want to say 'Look, you can now do all these cool 
things', my answer is 'Maybe we'll do them later, but not right now.'


cheers

andrew



Re: [HACKERS] antisocial things you can do in git (but not CVS)

2010-07-24 Thread Joshua D. Drake
On Sat, 2010-07-24 at 13:48 -0400, Andrew Dunstan wrote:
 

 Yeah. Also, please bear in mind that our explicit aim here is to make 
 this change with a minimal disruption to existing work flows. So to all 
 those people who want to say Look, you can now do all these cool 
 things my answer is Maybe we'll do them later, but not right now.

Amen, brother. Git is a universe apart from CVS/SVN. Let's practice
KISS for a while, shall we?

JD


-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt




Re: [PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar

2010-07-24 Thread Mike Fowler

On 21/07/10 08:33, Mike Fowler wrote:


Why is the first argument AexprConst instead of a_expr? The SQL
standard says it's a character string literal, but I think we can very
well allow arbitrary expressions.


Yes, it was AexprConst because of the specification. I also found that
using it solved my shift/reduce problems, but I can change it to a_expr
and see if I can work them out in a different way.


[snip]


Why c_expr?


As with AexprConst, its choice was partially influenced by the fact that
it solved the shift/reduce errors I was getting. I'm guessing then that
I should really use a_expr and resolve the shift/reduce problem
differently?



Attached is the revised version of the patch addressing all the issues 
raised in the review, except for the use of AexprConst and c_expr. With 
my limited knowledge of bison I've failed to resolve the shift/reduce 
errors that are introduced by using a_expr. I'm open to suggestions, as 
my desk is getting annoyed with me beating it in frustration!


Thanks again for taking the time to review my work.

Regards,

--
Mike Fowler
Registered Linux user: 379787
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*************** SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
*** 8554,8562 ****
  ]]></screen>
  </para>
 </sect3>
 
! <sect3>
  <title>XML Predicates</title>
 
  <indexterm>
   <primary>IS DOCUMENT</primary>
--- 8554,8570 ----
  ]]></screen>
  </para>
 </sect3>
+ </sect2>
 
! <sect2>
  <title>XML Predicates</title>
+ 
+ <indexterm>
+  <primary>XML Predicates</primary>
+ </indexterm>
+  
+ <sect3>
+  <title>IS DOCUMENT</title>
 
  <indexterm>
   <primary>IS DOCUMENT</primary>
*************** SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
*** 8574,8579 ****
--- 8582,8619 ----
   between documents and content fragments.
  </para>
 </sect3>
+
+ <sect3>
+  <title>XMLEXISTS</title>
+ 
+ <indexterm>
+  <primary>XMLEXISTS</primary>
+ </indexterm>
+ 
+ <synopsis>
+ <function>XMLEXISTS</function>(<replaceable>xpath</replaceable> <optional>PASSING BY REF <replaceable>xml</replaceable> <optional>BY REF</optional></optional>)
+ </synopsis>
+ 
+ <para>
+  The function <function>xmlexists</function> returns true if the <replaceable>xpath</replaceable> 
+  returns any nodes and false otherwise. If no <replaceable>xml</replaceable> is passed, the function
+  will return false as a XPath cannot be evaluated without content. See the
+  <ulink url="http://www.w3.org/TR/xpath/#section-Introduction">W3C recommendation 'XML Path Language'</ulink>
+  for a detailed explanation of why.
+ </para>
+ 
+ <para>
+  Example:
+  <screen><![CDATA[
+ SELECT xmlexists('//town[text() = ''Toronto'']' PASSING BY REF '<towns><town>Toronto</town><town>Ottawa</town></towns>');
+ 
+  xmlexists
+ -----------
+  t
+ (1 row)
+ ]]></screen>
+ </para>
+</sect3>
 </sect2>
 
 <sect2 id="functions-xml-processing">
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
*************** static TypeName *TableFuncTypeName(List *columns);
*** 423,431 ****
  %type <list>	opt_check_option
  
  %type <target>	xml_attribute_el
! %type <list>	xml_attribute_list xml_attributes
  %type <node>	xml_root_version opt_xml_root_standalone
! %type <ival>	document_or_content
  %type <boolean> xml_whitespace_option
  
  %type <node> 	common_table_expr
--- 423,432 ----
  %type <list>	opt_check_option
  
  %type <target>	xml_attribute_el
! %type <list>	xml_attribute_list xml_attributes xmlexists_list
  %type <node>	xml_root_version opt_xml_root_standalone
! %type <node>	xmlexists_query_argument_list
! %type <ival>	document_or_content 
  %type <boolean> xml_whitespace_option
  
  %type <node> 	common_table_expr
*************** static TypeName *TableFuncTypeName(List *columns);
*** 511,523 ****
  	OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR
  	ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER
  
! 	PARSER PARTIAL PARTITION PASSWORD PLACING PLANS POSITION
  	PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
  	PRIOR PRIVILEGES PROCEDURAL PROCEDURE
  
  	QUOTE
  
! 	RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX
  	RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
  	RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
  
--- 512,524 ----
  	OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR
  	ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER
  
! 	PARSER PARTIAL PARTITION PASSING PASSWORD PLACING PLANS POSITION
  	PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
  	PRIOR PRIVILEGES PROCEDURAL PROCEDURE
  
  	QUOTE
  
! 	RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REINDEX
  	RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
  	RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
  
*************** static TypeName *TableFuncTypeName(List *columns);
*** 539,545 ****
  
  	WHEN WHERE WHITESPACE_P WINDOW WITH WITHOUT WORK WRAPPER WRITE
  

Re: [HACKERS] patch: Add JSON datatype to PostgreSQL (GSoC, WIP)

2010-07-24 Thread Joseph Adams
Update: I'm in the middle of cleaning up the JSON code
(http://git.postgresql.org/gitweb?p=json-datatype.git;a=summary if you
want to see the very latest), so I haven't addressed all of the major
problems with it yet.

On Fri, Jul 23, 2010 at 2:34 PM, Robert Haas robertmh...@gmail.com wrote:
 - I was under the impression that we wanted EXPLAIN (FORMAT JSON) to
 return type json, but that's obviously not going to be possible if all
 of this is contrib.  We could (a) move it all into core, (b) move the
 type itself and its input and output functions into core and leave the
 rest in contrib [or something along those lines], or (c) give up using
 it as the return type for EXPLAIN (FORMAT JSON).

I've been developing it as a contrib module because:
* I'd imagine it's easier than developing it as a built-in datatype
right away (e.g. editing a .sql.in file versus editing pg_type.h).
* As a module, it has PGXS support, so people can try it out right
away rather than having to recompile PostgreSQL.

I, for one, think it would be great if the JSON datatype were all in
core :-)  However, whether and how much of the JSON code should go into
core is up for discussion.  Thoughts, anyone?

A particularly useful aspect of the JSON support is the ability to
convert PostgreSQL arrays to JSON arrays (using to_json), as there
currently isn't any streamlined way to parse arrays in the PostgreSQL
format client-side (that I know of).
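To illustrate the fiddliness being lamented here, a minimal client-side parser for the PostgreSQL array text format can be sketched in Python. This is a hypothetical helper, not part of any driver: it handles only one-dimensional arrays with the default ',' delimiter and does no element type conversion, but it does cover the quoting subtleties (embedded commas, backslash escapes, NULL) that ad-hoc implementations tend to get wrong.

```python
def parse_pg_array(s: str) -> list:
    """Parse a one-dimensional PostgreSQL array literal such as
    '{a,"b,c",NULL}' into a Python list (None for NULL).
    Sketch only: no multidimensional arrays, no custom delimiters."""
    assert s.startswith('{') and s.endswith('}')
    out, i, n = [], 1, len(s) - 1
    while i < n:
        if s[i] == '"':                       # quoted element: may hold , { } \ "
            i += 1
            buf = []
            while s[i] != '"':
                if s[i] == '\\':              # backslash escapes the next char
                    i += 1
                buf.append(s[i])
                i += 1
            out.append(''.join(buf))
            i += 1                            # skip closing quote
        else:                                 # unquoted element: ends at , or }
            j = i
            while j < n and s[j] != ',':
                j += 1
            tok = s[i:j]
            out.append(None if tok == 'NULL' else tok)
            i = j
        if i < n and s[i] == ',':
            i += 1
    return out
```

For example, `parse_pg_array('{a,"b,c",NULL}')` yields `['a', 'b,c', None]`.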


Joey Adams



Re: [HACKERS] patch: Add JSON datatype to PostgreSQL (GSoC, WIP)

2010-07-24 Thread Andres Freund
On Sat, Jul 24, 2010 at 06:57:18PM -0400, Joseph Adams wrote:
 A particularly useful aspect of the JSON support is the ability to
 convert PostgreSQL arrays to JSON arrays (using to_json ), as there
 currently isn't any streamlined way to parse arrays in the PostgreSQL
 format client-side (that I know of).
I really would like to address the latter issue some day. I don't know
how many broken implementations I have seen in my not-that-long time
using pg, but it is surely 10+. I have also knowingly written dumbed-down
versions myself.

Andres



Re: [HACKERS] [JDBC] Trouble with COPY IN

2010-07-24 Thread James William Pye
On Jul 23, 2010, at 7:11 AM, Tom Lane wrote:
 I can't help thinking that the JDBC driver must be being overly cute
 if this breaks it ...

I was wondering the same thing when I first saw Kris' message. However, if I 
understand what JDBC is trying to achieve, I don't think I would call it 
overly cute.

@Kris

Is this a problem because JDBC is trying to detect failures as early as 
possible during a COPY IN?
Or, is it just JDBC's normal MO to always be reading?


Well, I've wanted to do the former (early error detection) with py-postgresql's 
COPY support, and I imagine getting a read event marking completion prior to 
emitting done/fail could be a snag.
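At the socket level, the "early error detection" idea can be sketched roughly as follows. This is a generic illustration, not JDBC or py-postgresql code: the client polls for readability between data sends, so a server error (most likely an ErrorResponse during COPY IN) surfaces before the whole payload has been streamed.

```python
import select
import socket

def send_copy_data(sock: socket.socket, chunks):
    """Stream COPY-style data chunks, but check the socket for
    readability before each send.  If the server has already written
    something back, stop early and return those bytes; return None
    if every chunk was sent without the server speaking up."""
    for chunk in chunks:
        readable, _, _ = select.select([sock], [], [], 0)  # non-blocking poll
        if readable:
            return sock.recv(4096)   # early server message: abort the copy
        sock.sendall(chunk)
    return None
```

The snag James mentions is visible here too: a read event alone doesn't say whether the server is reporting failure or completion; the message type still has to be inspected.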

cheers, jwp


[HACKERS] patch: utf8_to_unicode (trivial)

2010-07-24 Thread Joseph Adams
In src/include/mb/pg_wchar.h, there is a function unicode_to_utf8,
but no corresponding utf8_to_unicode.  However, there is a static
function called utf2ucs that does what utf8_to_unicode would do.

I'd like this function to be available because the JSON code needs to
convert UTF-8 to and from Unicode codepoints, and I'm currently using
a separate UTF-8 to codepoint function for that.

This patch renames utf2ucs to utf8_to_unicode and makes it public.  It
also fixes the version of utf2ucs in src/bin/psql/mbprint.c so that
it's equivalent to the one in wchar.c.

This is a patch against CVS HEAD for application.  It compiles and
tests successfully.
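For readers unfamiliar with the conversion in question, the decoding that utf8_to_unicode performs, folding a 1- to 4-byte UTF-8 sequence into a single code point, can be sketched in Python. This is an illustrative sketch of the standard UTF-8 bit layout, not the backend C code, and it assumes well-formed input with none of the validation PostgreSQL does elsewhere.

```python
def utf8_to_unicode(b: bytes) -> int:
    """Decode the first UTF-8 sequence in b to a Unicode code point.
    Assumes valid, complete input (no error checking)."""
    c = b[0]
    if c < 0x80:
        return c                                        # 1 byte: 0xxxxxxx
    elif c < 0xE0:                                      # 2 bytes: 110xxxxx 10xxxxxx
        return ((c & 0x1F) << 6) | (b[1] & 0x3F)
    elif c < 0xF0:                                      # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return ((c & 0x0F) << 12) | ((b[1] & 0x3F) << 6) | (b[2] & 0x3F)
    else:                                               # 4 bytes: 11110xxx 10xxxxxx ...
        return (((c & 0x07) << 18) | ((b[1] & 0x3F) << 12)
                | ((b[2] & 0x3F) << 6) | (b[3] & 0x3F))
```

For instance, the two-byte sequence for 'é' (0xC3 0xA9) decodes to code point U+00E9.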

Comments?  Thanks,


Joey Adams


utf8-to-unicode.patch
Description: Binary data
