[HACKERS] Review of Synchronous Replication patches
Hello list,

My apologies if this email arrives more than once; I've been experiencing trouble posting to the -hackers list and am resending this review as a new thread instead of as a reply to the existing thread at http://archives.postgresql.org/pgsql-hackers/2010-07/msg01072.php which also has relevant links to both patches and discussion.

Here follows a severely time-boxed review of both synchronous replication patches. Expect this review to be incomplete, though I believe the general remarks to be valid. Off the list I've received word from Zoltan that work on a new patch is planned. It would be great if ideas from both patches could be merged into one.

regards, Yeb Havinga

Patch A: Zoltán Böszörményi
Patch B: Fujii Masao

* Is the patch in context diff format? A: Yes B: Yes
* Does it apply cleanly to the current CVS HEAD? A: No B: Yes
* Does it include reasonable tests, necessary doc patches, etc? A: Doc patches, and a contrib module for debugging purposes. B: Doc patches.

For neither patch is the documentation final; see e.g.
* 25.2 - The section starting with "It should be noted that the log shipping is asynchronous, i.e., ..." should be updated.
* 25.2.5 - Ditto for "Streaming replication is asynchronous, so there..."

Testing any replication setup is not possible with the current regression tool suite, and adding that is beyond the scope of any synchronous replication patch. Perhaps the work of Markus Wanner with dtester (which adds a make dcheck) can be assimilated for this purpose.

Both patches add synchronous replication to the current single-master streaming replication features in 9.0, which means that there is now a mechanism where a commit on the master does not finish until one or more of the replicas are in sync.

* Does the patch actually implement that? A: Yes B: No; if no standbys are connected, commits do not wait. (If I understand the code from WaitXLogSend correctly.)
* Do we want that?
For database users who never want to lose a single committed record, synchronous replication is a relatively easy setup to achieve a high level of 'security' (where secure means: less chance of losing data). The ease is relative to current solutions (the whole range of mirrored disks, controllers, backups, log shipping etc).

* Do we already have it? No
* Does it follow SQL spec, or the community-agreed behavior? A: Unknown, though the choices in guc parameters suggest the patch's been made to support actual use cases. B: Design of patch B has been discussed on the mailing list as well as the wiki (for links I refer to my previous mail)
* Are there dangers? Besides the usual errors causing transactions to fail, in my opinion the single largest danger would be that the master thinks a standby is in sync, where in fact it isn't.
* Have all the bases been covered? Patch A seems to cover two-phase commit where patch B does not. (I'm time constrained and currently do not have a replicated cluster with patch B anymore, so I cannot test.)

# Does the feature work as advertised? Patch A: yes Patch B: yes. I ran a few tests with failures and had some recoverable problems, and I'm unsure whether they are related to the sync rep patches or to streaming replication in general. Work in the direction of a replication regression test would be useful.
# Are there any assertion failures or crashes? No

A note: reading the source code of both patches makes it clear that these are patches from experienced PostgreSQL programmers. The source code reads just like 'normal' high quality postgresql source.

Differences: Patch A synchronizes by sending back the Xids that have been received and written to disk. When the master receives the xids, the respective backends having those xids active are unlocked and signalled to continue. This means some locking of the procarray. The libpq protocol was adapted so a client (the walreceiver) can report back the xids.
Patch B synchronizes by waiting to send WAL data until the sync replicas have sent back LSN pointers into the WAL files they have synced, in one of three ways. The libpq protocol was adapted so the walreceiver can report back WAL file positions.

Perhaps the weakness of both patches is that they are not one. In my opinion, both patches have their strengths. It would be great if these strengths could be unified in a single patch.

patch A strengths:
* design of the guc parameters - meaning of min_sync_replication_clients. As I understand it, it is not possible with patch B to define a minimum number of synchronous standby servers a transaction must be replicated to before the commit succeeds. Perhaps the name could be changed (quorum_min_sync_standbys?), but in my opinion the definitive sync replication patch needs a parameter with this number and exact meaning. The boolean 'synchronous_slave' guc is also a good one in my opinion.

patch B strengths:
* having the walsender synced by waiting for acked LSN, instead of signalling respective backends with a specific XID active, seems like a simpler and therefore better solution.
Re: [HACKERS] Review of Synchronous Replication patches
Hi, ... Off the list I've received word from Zoltan that work on a new patch is planned. It would be great if ideas from both patches could be merged into one. ... * Does it follow SQL spec, or the community-agreed behavior? A: Unknown, though the choices in guc parameters suggest the patch's been made to support actual use cases. B: Design of patch B has been discussed on the mailing list as well as the wiki (for links I refer to my previous mail) * Have all the bases been covered? Patch A seems to cover two-phase commit where patch B does not. (I'm time constrained and currently do not have a replicated cluster with patch B anymore, so I cannot test.) ... Differences: Patch A synchronizes by sending back the Xids that have been received and written to disk. When the master receives the xids, the respective backends having those xids active are unlocked and signalled to continue. This means some locking of the procarray. The libpq protocol was adapted so a client (the walreceiver) can report back the xids. Patch B synchronizes by waiting to send WAL data until the sync replicas have sent back LSN pointers into the WAL files they have synced, in one of three ways. The libpq protocol was adapted so the walreceiver can report back WAL file positions. Perhaps the weakness of both patches is that they are not one. In my opinion, both patches have their strengths. It would be great if these strengths could be unified in a single patch. patch A strengths: * design of the guc parameters - meaning of min_sync_replication_clients. As I understand it, it is not possible with patch B to define a minimum number of synchronous standby servers a transaction must be replicated to before the commit succeeds. Perhaps the name could be changed (quorum_min_sync_standbys?), but in my opinion the definitive sync replication patch needs a parameter with this number and exact meaning. The boolean 'synchronous_slave' guc is also a good one in my opinion.
patch B strengths: * having the walsender synced by waiting for acked LSN, instead of signalling respective backends with a specific XID active, seems like a simpler and therefore better solution. * the three different ways when to sync (recv, flush and replay), and also that this is configurable per standby.

In patch B there's probably some work to do in WaitXLogSend(); after a long day I'm having trouble understanding its meaning, but somehow I believe that a more intuitive set of guc parameters can be found, accompanied by clear code in this function. The way patch A synchronizes and counts synchronous standbys for a commit is clearer.

This week I am on vacation, but we discussed your review a little with Hans-Jürgen Schönig, and given the opinion and the consensus that sending back LSNs is a better solution, I will withdraw my patch from the commitfest page. Instead, I will post a patch that unifies my configuration choices with Fujii's patch. Do you have suggestions for better worded GUCs? slave seems to be phased out by standby for political correctness, so synchronous_standby instead of synchronous_slave. You mentioned min_sync_replication_clients - quorum_min_sync_standbys. What else? You also noticed that my patch addressed 2PC; maybe I will have to add this part to Fujii's patch, too. Note: I haven't yet read his patch; maybe working with LSNs instead of XIDs makes this work automatically, I don't know. We should definitely test.

Best regards, Zoltán Böszörményi

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
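[The GUC names debated in this exchange can be pictured as a configuration fragment. The names below are the ones floated in the thread, not actual PostgreSQL parameters; treat this as a sketch of the proposal, not documented syntax.]

```
# Hypothetical postgresql.conf fragment on the master, using the
# parameter names discussed above (not actual 9.x GUCs):
quorum_min_sync_standbys = 2   # commit returns once 2 sync standbys have acked
```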
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On 7/24/2010 1:43 AM, Robert Haas wrote: On Fri, Jul 23, 2010 at 6:20 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: It seems like it's EXPLAIN ANALYZE that needs fixing. I would suggest that if we're going to change this, we back-patch it to 9.0 before beta4. I did some digging and found the commit that changed the behaviour: http://archives.postgresql.org/pgsql-committers/2004-09/msg00155.php and then later Tom fixed a bug in it: http://archives.postgresql.org/pgsql-committers/2005-10/msg00316.php According to the latter commit, not updating the snapshot could be preferable for EXPLAIN ANALYZE, but I don't see why this is. Maybe we should wait until we hear from Tom? Regards, Marko Tiikkaja
Re: [HACKERS] antisocial things you can do in git (but not CVS)
On 21/07/2010 09:53, Dave Page wrote: On Tue, Jul 20, 2010 at 8:12 PM, Peter Eisentraut pete...@gmx.net wrote: My preference would be to stick to a style where we identify the committer using the author tag and note the patch author, reviewers, whether the committer made changes, etc. in the commit message. A single author field doesn't feel like enough for our workflow, and having a mix of authors and committers in the author field seems like a mess. Well, I had looked forward to actually putting the real author into the author field. I hadn't realised that was possible until Guillaume did so on his first commit to the new pgAdmin GIT repo. It seems to work nicely: http://git.postgresql.org/gitweb?p=pgadmin3.git;a=commit;h=08e2826d90129bd4e4b3b7462bab682dd6a703e4 It's one of the nice things with git. So, I'm eager to use it with the pgAdmin repo. -- Guillaume http://www.postgresql.fr http://dalibo.com
Re: [HACKERS] Review: Patch for phypot - Pygmy Hippotause
It looks good to me: (0) the new patch applies cleanly to CVS HEAD; (1) the formatting of the code was changed; (2) the definition of the HYPOT macro was changed to use phypot rather than being removed; (3) the phypot function was declared to be extern; (4) the comments on the phypot function were changed to remove the reference to the SUS behavior. I changed the status to Ready for Committer. Thanks Andrew On Fri, Jul 23, 2010 at 4:01 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Tom Lane t...@sss.pgh.pa.us wrote: I think the patch is good in principle Since everyone seems to agree this is a good patch which needed minor tweaks, and we haven't heard from the author, I just went ahead and made the changes. Andrew, could you take another look and see if you agree I've covered it all before it gets marked ready for a committer? -Kevin
Re: [HACKERS] Review of Synchronous Replication patches
Hello Zoltán, Thanks for your reply! Instead, I will post a patch that unifies my configuration choices with Fujii's patch. Please know that Fujii's patch is also a work in progress. I didn't mention in my review the previously discussed items, most importantly changing the polling loops (see e.g. http://archives.postgresql.org/pgsql-hackers/2010-07/msg00757.php). Do you have suggestions for better worded GUCs? slave seems to be phased out by standby for political correctness, so synchronous_standby instead of synchronous_slave. You mentioned min_sync_replication_clients - quorum_min_sync_standbys. What else? The 'quorum_min_sync_standbys' came from a reply to the design of Fujii's patch on http://archives.postgresql.org/pgsql-hackers/2010-07/msg01167.php. However, after thinking about it a bit more, I still fail to see the use case for a maximum quorum number. Having a 'max quorum' also seems to contradict common meanings of 'quorum', which in itself is a minimum: a least required number, a lower limit. Having sync in the name is also useful, imho, so people know it's not about async servers. So then the name would become quorum_sync_standbys or synchronous_standby_quorum. Though I think I wouldn't use 'strict_sync_replication' (explained in http://archives.postgresql.org/pgsql-hackers/2010-04/msg01516.php), I can imagine others want this feature, to not have the master halt on sync standby failure. If that's what somebody never wants, then the way this is solved by this parameter is elegant: only wait if they are connected. In recovery.conf, a boolean to discern between sync and async servers (like in your patch) is IMHO better than mixing 'sync or async' with the replication modes. Together with the replication modes, this could then become synchronous_standby (boolean) synchronous_mode (recv,fsync,replay) You also noticed that my patch addressed 2PC, maybe I will have to add this part to Fujii's patch, too.
Note: I haven't yet read his patch, maybe working with LSNs instead of XIDs makes this work automatically, I don't know. Yes, and I think we definitely need automated replicated cluster tests. A large part of reviewing went into virtual machine setup and cluster setup. I'm not sure if a full test suite that includes node failures could or should be included in the core regression test, but anything that automates setup, a few transactions and failures would benefit everyone working on and testing replication. regards, Yeb Havinga
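[The per-standby settings proposed in this message can be pictured as a recovery.conf fragment. The synchronous_standby and synchronous_mode names are the ones proposed in the discussion, not parameters that actually existed; this is a sketch of the proposal only.]

```
# Hypothetical recovery.conf fragment on a standby, following the shape
# proposed above (parameter names were not final):
standby_mode = 'on'
primary_conninfo = 'host=master port=5432'
synchronous_standby = 'on'      # boolean: take part in synchronous replication
synchronous_mode = 'fsync'      # when to ack: recv, fsync or replay
```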
Re: [HACKERS] Functional dependencies and GROUP BY
On fre, 2010-07-23 at 11:00 -0600, Alex Hunsaker wrote: I just read that patch is getting pushed till at least the next commitfest: http://archives.postgresql.org/pgsql-hackers/2010-07/msg01219.php Should we push this patch back too? Alternatively we could make it work with just primary keys until the other patch gets in. I think that makes sense; find that attached. Thoughts? I was thinking the same thing. Note I axed the index not-null attribute checking; I have not thought too deeply about it other than if it's a primary key it can't have null attributes, right? I may have missed something subtle, hence the heads up. Another open question I thought of was whether we should put the dependency record on the pg_index row, or the pg_constraint row, or perhaps the pg_class row. Right now, it is using pg_index, because that was easiest to code up, but I suspect that once we have not-null constraints in pg_constraint, it will be more consistent to make all dependencies go against pg_constraint rather than a mix of several catalogs.
Re: [HACKERS] Review of Synchronous Replication patches
Hello Zoltán, Thanks for your reply! Instead, I will post a patch that unifies my configuration choices with Fujii's patch. Please know that Fujii's patch is also a work in progress. Yes, I know that. But working from Fujii's last public patch or from his GIT tree makes my patch easier to merge. I didn't mention in my review the previously discussed items, most importantly changing the polling loops (see e.g. http://archives.postgresql.org/pgsql-hackers/2010-07/msg00757.php). Do you have suggestions for better worded GUCs? slave seems to be phased out by standby for political correctness, so synchronous_standby instead of synchronous_slave. You mentioned min_sync_replication_clients - quorum_min_sync_standbys. What else? The 'quorum_min_sync_standbys' came from a reply to the design of Fujii's patch on http://archives.postgresql.org/pgsql-hackers/2010-07/msg01167.php. However, after thinking about it a bit more, I still fail to see the use case for a maximum quorum number. Having a 'max quorum' also seems to contradict common meanings of 'quorum', which in itself is a minimum: a least required number, a lower limit. Having sync in the name is also useful, imho, so people know it's not about async servers. So then the name would become quorum_sync_standbys or synchronous_standby_quorum. Ok. Though I think I wouldn't use 'strict_sync_replication' (explained in http://archives.postgresql.org/pgsql-hackers/2010-04/msg01516.php), I can imagine others want this feature, to not have the master halt on sync standby failure. If that's what somebody never wants, then the way this is solved by this parameter is elegant: only wait if they are connected. Thanks, I was thinking about possible use cases and different preferences of people. In recovery.conf, a boolean to discern between sync and async servers (like in your patch) is IMHO better than mixing 'sync or async' with the replication modes.
Together with the replication modes, this could then become synchronous_standby (boolean) synchronous_mode (recv,fsync,replay) Ok. You also noticed that my patch addressed 2PC, maybe I will have to add this part to Fujii's patch, too. Note: I haven't yet read his patch, maybe working with LSNs instead of XIDs makes this work automatically, I don't know. Yes, and I think we definitely need automated replicated cluster tests. A large part of reviewing went into virtual machine setup and cluster setup. I'm not sure if a full test suite that includes node failures could or should be included in the core regression test, but anything that automates setup, a few transactions and failures would benefit everyone working on and testing replication. regards, Yeb Havinga
[HACKERS] TwoPO: experimental join order algorithm
Hi, I'd like to release the latest version of my experimental join order algorithm (TwoPO - Two Phase Optimization [1]): http://git.c3sl.ufpr.br/gitweb?p=lbd/ljqo.git;a=summary

This algorithm is not production-ready, but an experimental set of ideas, which need to be refined and evaluated. As join order optimization is a hard problem, the evaluation of a search strategy is also a hard task. Therefore, I think the most important TODO item related to the replacement of the GEQO algorithm is to define a set of evaluation criteria considered relevant.

TwoPO is encapsulated in a plug-in called LJQO (Large Join Query Optimization [2]). This plug-in has two main GUC variables:
ljqo_threshold = N (like geqo_threshold)
ljqo_algorithm = {twopo|geqo}

As its name implies, TwoPO internally has two search strategies that constitute its two phases of optimization:
* Iterative Improvement (II) - Biased Sampling + Local Search
* Simulated Annealing (SA)

This algorithm also works in two search spaces:
* deep-trees (subset of bushy-trees) - list of baserels - initial states: biased sampling using Prim's algorithm over the join graph (new, very efficient) - moves: swap, 3cycle [2]
* bushy-trees - binary join tree representation - initial states: biased sampling using Kruskal's algorithm over the join graph [3,4].
- moves: associative

You can modify the functionality of TwoPO through the following parameters:
twopo_bushy_space = {true|false} - set it to false if you want only deep-trees (default=true)
twopo_heuristic_states = {true|false} - enables the heuristic for initial states (default=true)
twopo_ii_stop = Int - number of initial states (default=10)
twopo_ii_improve_states = {true|false} - find the local minimum of each initial state (default=true)
twopo_sa_phase = {true|false} - enables the Simulated Annealing (SA) phase (default=true)
twopo_sa_initial_temperature = Float - initial temperature for the SA phase (default=0.1)
twopo_sa_temperature_reduction = Float - temperature reduction factor (default=0.95)
twopo_sa_equilibrium = Int - number of states generated for each temperature (Int * state size) (default=16)

References:
[1] Yannis E. Ioannidis and Younkyung Kang. Randomized algorithms for optimizing large join queries. SIGMOD Rec., 19(2):312-321, 1990.
[2] Arun Swami and Anoop Gupta. Optimization of large join queries. SIGMOD '88: Proceedings of the 1988 ACM SIGMOD international conference on Management of data, pages 8-17, New York, NY, USA, 1988. ACM.
[3] P.B. Guttoski, M. S. Sunye, and F. Silva. Kruskal's algorithm for query tree optimization. IDEAS '07: Proceedings of the 11th International Database Engineering and Applications Symposium, pages 296-302, Washington, DC, USA, 2007. IEEE Computer Society.
[4] Florian Waas and Arjan Pellenkoft. Join order selection - good enough is easy. BNCOD, pages 51-67, 2000.

Regards, Adriano Lange
Re: [HACKERS] antisocial things you can do in git (but not CVS)
Robert Haas wrote: If git had a place to store all the information we care about, that would be fine... There's no reviewer header, and there's no concept that a patch might have come from the author (or perhaps multiple authors), but then have been adjusted by one or more reviewers and then frobnicated some more by the committer ... I don't think that non-linear history is an advantage in any situation. ISTM we could have the best of both those worlds - linear history and author/reviewer/committer information. Instead of squashing every patch into a single commit, what if it got squashed into perhaps 3 separate commits -- one as submitted, one as reviewed, and one as re-written by the committer. History stays linear; and you keep the most relevant parts of the history, while dropping all the development brainstorming commits. And ISTM the patch reviewer could be responsible for this squashing so it's not much more work for the committer. For example, instead of commit 351c0b92eca40c1a36934cf83fe75db9dc458cde Author: Robert Haas robertmh...@gmail.com Date: Fri Jul 23 00:43:00 2010 + Avoid deep recursion when assigning XIDs to multiple levels of subxacts. Andres Freund, with cleanup and adjustment for older branches by me. we'd see Author: Andres Freund Date: Fri Jul 23 00:43:00 2010 + Avoid deep recursion when assigning XIDs to multiple levels of subxacts. Patch as originally submitted to the commitfest. Author: [Whomever the reviewer was] Date: Fri Jul 23 00:43:00 2010 + Avoid deep recursion when assigning XIDs to multiple levels of subxacts. Patch marked ready for committer by reviewer. Author: Robert Haas robertmh...@gmail.com Date: Fri Jul 23 00:43:00 2010 + Avoid deep recursion when assigning XIDs to multiple levels of subxacts. Patch as rewritten by committer. For a complex enough patch with many authors, the reviewer could choose to keep an extra author commit or two to credit the extra authors.
Re: [HACKERS] bg worker: overview
Hi, On 07/23/2010 09:45 PM, Dimitri Fontaine wrote: Yeah, I guess user daemons would have to be workers, not plugins you want to load into the coordinator. Okay. On the other side, the background workers have a connection to exactly one database. They are supposed to do work on that database. Is that because of the way backends are started, and to avoid having to fork new ones too often? For one, yes, I want to avoid having to start new ones too often. I did look into letting these background workers switch the database connection, but that turned out not to be worth the effort. Would you prefer a background worker that's not connected to a database, or why are you asking? The background workers can easily load external libraries - just as a normal backend can with LOAD. That would also provide better encapsulation (i.e. an error would only tear down that backend, not the coordinator). You'd certainly have to communicate between the coordinator and the background worker. I'm not sure how well that fits your use case. Pretty well, I think. Go ahead, re-use the background workers. That's what I've published them for ;-) Yeah. The connection pool is better outside of core. Let's think about PGQ and an internal task scheduler first, if we think about any generalisation. To be honest, I still don't quite grok the concept behind PGQ. So I cannot really comment on this. Regards Markus Wanner
Re: [HACKERS] bg worker: overview
Markus Wanner mar...@bluegap.ch writes: For one, yes, I want to avoid having to start new ones too often. I did look into letting these background workers switch the database connection, but that turned out not to be worth the effort. Would you prefer a background worker that's not connected to a database, or why are you asking? Trying to figure out how it would fit the PGQ and pgagent needs. But maybe user defined daemons should be sub-coordinators (I used to think about them as supervisors) able to talk to the coordinator to get a backend connected to some given database and distribute work to it. You're using iMessage as the data exchange; how are you doing the work distribution? What do you use to tell the backend what processing you're interested in? Go ahead, re-use the background workers. That's what I've published them for Hehe :) The aim of this thread would be to have your input as far as designing an API goes, now that we're about on track as to what the aim is. To be honest, I still don't quite grok the concept behind PGQ. So I cannot really comment on this. In very short, the idea is a clock that ticks and associates current_txid() to now(), so that you're able to say "give me 3s worth of transaction activity from this queue". It then provides facilities to organise a queue into batches at consumer request; for more details, see: http://github.com/markokr/skytools-dev/blob/master/sql/ticker/pgqd.c http://github.com/markokr/skytools-dev/blob/master/sql/ticker/ticker.c But the important thing as far as making it a child of the coordinator goes would be, I guess, that it's some C code running as a daemon and running SQL queries from time to time. The SQL queries are calling C user defined functions, provided by the PGQ backend module. Regards, -- dim
Re: [HACKERS] antisocial things you can do in git (but not CVS)
On lör, 2010-07-24 at 07:02 -0700, Ron Mayer wrote: Instead of squashing every patch into a single commit, what if it got squashed into perhaps 3 separate commits -- one as submitted, one as reviewed, and one as re-written by the committer. History stays linear; and you keep the most relevant parts of the history, while dropping all the development brainstorming commits. But then there would be commits in the main repository that were known to be not good enough.
Re: [HACKERS] antisocial things you can do in git (but not CVS)
Peter Eisentraut wrote: On lör, 2010-07-24 at 07:02 -0700, Ron Mayer wrote: Instead of squashing every patch into a single commit, what if it got squashed into perhaps 3 separate commits -- one as submitted, one as reviewed, and one as re-written by the committer. History stays linear; and you keep the most relevant parts of the history, while dropping all the development brainstorming commits. But then there would be commits in the main repository that were known to be not good enough. Yeah. Also, please bear in mind that our explicit aim here is to make this change with minimal disruption to existing work flows. So to all those people who want to say "Look, you can now do all these cool things", my answer is "Maybe we'll do them later, but not right now." cheers andrew
Re: [HACKERS] antisocial things you can do in git (but not CVS)
On Sat, 2010-07-24 at 13:48 -0400, Andrew Dunstan wrote: Yeah. Also, please bear in mind that our explicit aim here is to make this change with minimal disruption to existing work flows. So to all those people who want to say "Look, you can now do all these cool things", my answer is "Maybe we'll do them later, but not right now." Amen, brother. Git is a universe different from CVS/SVN. Let's practice KISS for a while, shall we? JD -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
Re: [PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar
On 21/07/10 08:33, Mike Fowler wrote: Why is the first argument AexprConst instead of a_expr? The SQL standard says it's a character string literal, but I think we can very well allow arbitrary expressions. Yes, it was AexprConst because of the specification. I also found that using it solved my shift/reduce problems, but I can change it to a_expr and see if I can work them out in a different way. [snip] Why c_expr? As with AexprConst, its choice was partially influenced by the fact it solved the shift/reduce errors I was getting. I'm guessing then that I should really use a_expr and resolve the shift/reduce problem differently? Attached is the revised version of the patch addressing all the issues raised in the review, except for the use of AexprConst and c_expr. With my limited knowledge of bison I've failed to resolve the shift/reduce errors that are introduced by using a_expr. I'm open to suggestions as my desk is getting annoyed with me beating it in frustration! Thanks again for taking the time to review my work. Regards, -- Mike Fowler Registered Linux user: 379787 *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 8554,8562 SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; ]]/screen /para /sect3 !sect3 titleXML Predicates/title indexterm primaryIS DOCUMENT/primary --- 8554,8570 ]]/screen /para /sect3 +/sect2 !sect2 titleXML Predicates/title + + indexterm + primaryXML Predicates/primary + /indexterm + +sect3 + titleIS DOCUMENT/title indexterm primaryIS DOCUMENT/primary *** *** 8574,8579 SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; --- 8582,8619 between documents and content fragments.
/para /sect3 + +sect3 + titleXMLEXISTS/title + + indexterm + primaryXMLEXISTS/primary + /indexterm + + synopsis + functionXMLEXISTS/function(replaceablexpath/replaceable optionalPASSING BY REF replaceablexml/replaceable optionalBY REF/optional/optional) + /synopsis + + para + The function functionxmlexists/function returns true if the replaceablexpath/replaceable + returns any nodes and false otherwise. If no replaceablexml/replaceable is passed, the function + will return false as a XPath cannot be evaluated without content. See the + ulink url=http://www.w3.org/TR/xpath/#section-Introduction;W3C recommendation 'XML Path Language'/ulink + for a detailed explanation of why. + /para + + para + Example: + screen![CDATA[ + SELECT xmlexists('//town[text() = ''Toronto'']' PASSING BY REF 'townstownToronto/towntownOttawa/town/towns'); + + xmlexists + + t + (1 row) + ]]/screen + /para +/sect3 /sect2 sect2 id=functions-xml-processing *** a/src/backend/parser/gram.y --- b/src/backend/parser/gram.y *** *** 423,431 static TypeName *TableFuncTypeName(List *columns); %type list opt_check_option %type target xml_attribute_el ! %type list xml_attribute_list xml_attributes %type node xml_root_version opt_xml_root_standalone ! %type ival document_or_content %type boolean xml_whitespace_option %type node common_table_expr --- 423,432 %type list opt_check_option %type target xml_attribute_el ! %type list xml_attribute_list xml_attributes xmlexists_list %type node xml_root_version opt_xml_root_standalone ! %type node xmlexists_query_argument_list ! %type ival document_or_content %type boolean xml_whitespace_option %type node common_table_expr *** *** 511,523 static TypeName *TableFuncTypeName(List *columns); OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR ORDER OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER ! PARSER PARTIAL PARTITION PASSWORD PLACING PLANS POSITION PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY PRIOR PRIVILEGES PROCEDURAL PROCEDURE QUOTE ! 
RANGE READ REAL REASSIGN RECHECK RECURSIVE REFERENCES REINDEX
  RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
  RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
--- 512,524 ----
  OBJECT_P OF OFF OFFSET OIDS ON ONLY OPERATOR OPTION OPTIONS OR ORDER
  OUT_P OUTER_P OVER OVERLAPS OVERLAY OWNED OWNER
! PARSER PARTIAL PARTITION PASSING PASSWORD PLACING PLANS POSITION
  PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
  PRIOR PRIVILEGES PROCEDURAL PROCEDURE QUOTE
! RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REINDEX
  RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA RESET RESTART
  RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROW ROWS RULE
*************** static TypeName *TableFuncTypeName(List *columns);
*** 539,545 ****
  WHEN WHERE WHITESPACE_P WINDOW WITH WITHOUT WORK WRAPPER WRITE
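The documentation hunk above describes XMLEXISTS as returning true when the XPath selects any nodes, and false when no XML content is passed. That behavior can be sketched in Python with ElementTree; this is only an illustrative approximation (ElementTree implements a limited XPath subset, not the W3C XPath PostgreSQL uses, so the `[.='Toronto']` predicate below is an adapted stand-in for the example's `//town[text() = 'Toronto']`):

```python
import xml.etree.ElementTree as ET

def xmlexists(xpath, xml=None):
    # Mimics the documented behavior: false when no content is passed,
    # otherwise true iff the XPath selects at least one node.
    if xml is None:
        return False
    root = ET.fromstring(xml)
    return len(root.findall(xpath)) > 0

# ElementTree's "[.='text']" predicate (Python 3.7+) stands in for
# XPath's text() comparison in the documentation example.
print(xmlexists(".//town[.='Toronto']",
                "<towns><town>Toronto</town><town>Ottawa</town></towns>"))  # prints: True
```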
Re: [HACKERS] patch: Add JSON datatype to PostgreSQL (GSoC, WIP)
Update: I'm in the middle of cleaning up the JSON code ( http://git.postgresql.org/gitweb?p=json-datatype.git;a=summary if you want to see the very latest ), so I haven't addressed all of the major problems with it yet.

On Fri, Jul 23, 2010 at 2:34 PM, Robert Haas robertmh...@gmail.com wrote:
> I was under the impression that we wanted EXPLAIN (FORMAT JSON) to
> return type json, but that's obviously not going to be possible if all
> of this is contrib. We could (a) move it all into core, (b) move the
> type itself and its input and output functions into core and leave the
> rest in contrib [or something along those lines], or (c) give up using
> it as the return type for EXPLAIN (FORMAT JSON).

I've been developing it as a contrib module because:

* I'd imagine it's easier than developing it as a built-in datatype right away (e.g. editing a .sql.in file versus editing pg_type.h ).
* As a module, it has PGXS support, so people can try it out right away rather than having to recompile PostgreSQL.

I, for one, think it would be great if the JSON datatype were all in core :-) However, if and how much JSON code should go into core is up for discussion. Thoughts, anyone?

A particularly useful aspect of the JSON support is the ability to convert PostgreSQL arrays to JSON arrays (using to_json ), as there currently isn't any streamlined way to parse arrays in the PostgreSQL format client-side (that I know of).

Joey Adams

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
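To make the pain point concrete: PostgreSQL's text format for arrays is just unusual enough that ad-hoc client-side parsers tend to get the edge cases wrong. Below is a deliberately minimal Python sketch (a hypothetical helper, not anyone's actual client code) that handles only flat, unquoted elements; real input can also contain quoted elements, backslash escapes, NULLs, and nesting, which is exactly where hand-rolled versions break:

```python
import json

def parse_pg_array(literal):
    """Parse a flat, unquoted PostgreSQL array literal like '{1,2,3}'.

    Minimal on purpose: no quoted elements, backslash escapes, NULLs,
    or nested arrays -- the cases ad-hoc parsers routinely mishandle.
    """
    s = literal.strip()
    if not (s.startswith("{") and s.endswith("}")):
        raise ValueError("not a PostgreSQL array literal: %r" % literal)
    inner = s[1:-1]
    return [] if inner == "" else inner.split(",")

# A server-side to_json would make this client-side round-trip unnecessary:
print(json.dumps(parse_pg_array("{1,2,3}")))  # prints: ["1", "2", "3"]
```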
Re: [HACKERS] patch: Add JSON datatype to PostgreSQL (GSoC, WIP)
On Sat, Jul 24, 2010 at 06:57:18PM -0400, Joseph Adams wrote:
> A particularly useful aspect of the JSON support is the ability to
> convert PostgreSQL arrays to JSON arrays (using to_json ), as there
> currently isn't any streamlined way to parse arrays in the PostgreSQL
> format client-side (that I know of).

I really would like to address the latter issue some day. I don't know how many broken implementations I have seen in my (not that long) time using pg, but it is surely 10+. I have also knowingly written dumbed-down versions myself.

Andres
Re: [HACKERS] [JDBC] Trouble with COPY IN
On Jul 23, 2010, at 7:11 AM, Tom Lane wrote:
> I can't help thinking that the JDBC driver must be being overly cute
> if this breaks it ...

I was wondering the same thing when I first saw Kris' message. However, if I understand what JDBC is trying to achieve, I don't think I would call it overly cute.

@Kris: Is this a problem because JDBC is trying to detect failures as early as possible during a COPY IN? Or is it just JDBC's normal MO to always be reading? Well, I've wanted to do the former (early error detection) with py-postgresql's COPY support, and I imagine getting a read event marking completion prior to emitting done/fail could be a snag.

cheers, jwp
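The "always be reading" pattern discussed here can be sketched generically: while streaming COPY data, poll the connection so a server error surfaces before the client has pushed the whole data set. This is only an illustration over a plain socket with a made-up one-byte error marker, not the real FE/BE protocol and not the JDBC driver's actual implementation:

```python
import select
import socket

def copy_rows(sock, rows):
    """Stream rows, polling for an early error byte between sends.

    Hypothetical wire format: the peer sends b'E' as soon as it rejects
    the input (a stand-in for a real ErrorResponse message).
    """
    for row in rows:
        # Non-blocking check: has the server already complained?
        readable, _, _ = select.select([sock], [], [], 0)
        if readable and sock.recv(1) == b"E":
            return False  # abort early instead of streaming everything
        sock.sendall(row)
    return True

# Simulate a server that rejects the input immediately.
client, server = socket.socketpair()
server.sendall(b"E")
print(copy_rows(client, [b"1\tfoo\n", b"2\tbar\n"]))  # prints: False
```

The same polling loop is what lets a driver report a mid-COPY failure after kilobytes instead of gigabytes of wasted transfer.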
[HACKERS] patch: utf8_to_unicode (trivial)
In src/include/mb/pg_wchar.h , there is a function unicode_to_utf8 , but no corresponding utf8_to_unicode . However, there is a static function called utf2ucs that does what utf8_to_unicode would do. I'd like this function to be available because the JSON code needs to convert UTF-8 to and from Unicode code points, and I'm currently using a separate UTF-8-to-codepoint function for that.

This patch renames utf2ucs to utf8_to_unicode and makes it public. It also fixes the version of utf2ucs in src/bin/psql/mbprint.c so that it's equivalent to the one in wchar.c .

This is a patch against CVS HEAD for application. It compiles and tests successfully. Comments?

Thanks,
Joey Adams

utf8-to-unicode.patch
Description: Binary data
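For reference, the decoding a utf2ucs-style helper performs is plain bit arithmetic over the UTF-8 leading and continuation bytes. A Python sketch of the same logic (assuming already-validated UTF-8, as the server-side function can; this is an illustration of the technique, not the patch's C code):

```python
def utf8_to_unicode(buf):
    """Decode the first UTF-8 sequence in buf to a Unicode code point.

    Assumes valid, already-validated UTF-8 input, like the server-side
    helper it mirrors; no error handling for malformed sequences.
    """
    c = buf[0]
    if c < 0x80:                      # 1 byte:  0xxxxxxx
        return c
    elif c < 0xE0:                    # 2 bytes: 110xxxxx 10xxxxxx
        return ((c & 0x1F) << 6) | (buf[1] & 0x3F)
    elif c < 0xF0:                    # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return ((c & 0x0F) << 12) | ((buf[1] & 0x3F) << 6) | (buf[2] & 0x3F)
    else:                             # 4 bytes: 11110xxx then three 10xxxxxx
        return (((c & 0x07) << 18) | ((buf[1] & 0x3F) << 12)
                | ((buf[2] & 0x3F) << 6) | (buf[3] & 0x3F))

print(hex(utf8_to_unicode("€".encode("utf-8"))))  # prints: 0x20ac
```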