Re: [HACKERS] Synch Rep for CommitFest 2009-07
Dimitri Fontaine wrote:
> On 15 Jul 2009, at 23:03, Heikki Linnakangas wrote:
>> 2. The primary should have no business reading back from the archive. The standby can read from the archive, as it can today.
> Sorry to insist, but I'm not sold on your consensus here, yet: http://archives.postgresql.org/pgsql-hackers/2009-07/msg00486.php There's a true need for the solution to be simple to install, and providing a side channel for the standby to go read the archives itself isn't it.

I think a better way to address that need is to provide a built-in mechanism for the standby to request a base backup and have it sent over the wire. That makes the initial setup very easy.

> Furthermore, the counter-argument against having the primary able to send data from the archives to some standby is that it should still work when the primary's dead, but as this is only done in the setup phase, I don't see that being able to continue preparing a not-yet-ready standby against a dead primary is buying us anything.

The situation also arises when the standby falls badly behind. A simple solution to that is to add a switch on the master to always keep X MB of WAL in pg_xlog. The standby will then still find the WAL in pg_xlog, making it harder for a standby to fall so far behind that it can't find the WAL it needs on the primary anymore. Tom suggested that we can just give up and re-sync with a new base backup, but that really requires built-in base backup capability, and is only practical for small databases.

I think we should definitely have both of those features, but they're not urgent. The replication works without them, although it requires that you set up traditional archiving as well.

-- Heikki Linnakangas, EnterpriseDB, http://www.enterprisedb.com
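To make the "always keep X MB of WAL in pg_xlog" idea concrete, here is a minimal sketch of the recycling cutoff such a switch implies. This is an illustration only, not code from the patch: the setting name and function are invented, and the 16 MB segment size is the default build-time value.

    #include <stdint.h>

    #define XLOG_SEG_SIZE  (16 * 1024 * 1024)   /* default WAL segment size */

    static int wal_keep_mb = 512;   /* hypothetical "keep X MB" switch */

    /*
     * Oldest WAL segment number a checkpoint may recycle, given the
     * segment the current redo pointer falls in.  Everything from the
     * returned segment onward stays in pg_xlog for lagging standbys.
     */
    static uint64_t
    keep_log_seg(uint64_t current_seg)
    {
        uint64_t keep = ((uint64_t) wal_keep_mb * 1024 * 1024) / XLOG_SEG_SIZE;

        return (current_seg <= keep) ? 0 : current_seg - keep;
    }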
Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects
On Thu, Jul 16, 2009 at 01:49:14PM +0900, KaiGai Kohei wrote:
> Joshua, I found your name as a reviewer at commitfest.postgresql.org. However, I don't think the initial proposal for largeobject security is in a state to be reviewed seriously yet. My preference is that it be reworked based on the TOASTing approach discussed previously in this thread, if we can find a reasonable solution for the issue I raised before.

For whatever it's worth, I consider my capability limited to making sure the patch applies and coming up with a few interesting ways of testing it out, rather than seriously studying the code and knowing when there might be a competitive alternative implementation. I don't yet understand the issues that have been raised, but I will stop working on the review if you feel the patch isn't ready. Thanks for letting me know; I hope a solution to the problems you've brought up is forthcoming.

-- Joshua Tolley / eggyknap, End Point Corporation, http://www.endpoint.com
Re: [HACKERS] Mostly Harmless: c++reserved - patch 1 of 4
On Friday 05 December 2008 11:13:37 Kurt Harriman wrote:
> 1. c++reserved

I have applied (an extended version of) this patch.
Re: [HACKERS] Synch Rep for CommitFest 2009-07
Hi,

Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> writes:
> I think a better way to address that need is to provide a built-in mechanism for the standby to request a base backup and have it sent over the wire. That makes the initial setup very easy.

Great idea :) So I'll reproduce the sketch I gave in that other mail, adding the 'base' state where the prerequisite base backup is handled; that will help clarify the next points:

0. base: the slave asks the master for a base backup; at the end of this it reaches the base LSN
1. init: the slave asks the master for the current LSN and starts streaming WAL
2. setup: the slave asks the master for the missing WALs from its base LSN to the LSN it just got, and applies them all to reach the init LSN (this happens in parallel with 1)
3. catchup: the slave has replayed the missing WALs and is now replaying the stream it received in parallel, which applies from the init LSN (just reached)
4. sync: the slave is applying the stream as it gets it, either as part of the master transaction or not, depending on the GUC settings

> The situation also arises when the standby falls badly behind. A simple solution to that is to add a switch on the master to always keep X MB of WAL in pg_xlog. [...] Tom suggested that we can just give up and re-sync with a new base backup, but that really requires built-in base backup capability, and is only practical for small databases.

I think that when the standby is back in business after a connection glitch (or any other transient error), its current internal state is still 'sync' and walreceiver asks for the next LSN (RedoPtr?). Now, two cases are possible:

a. the primary still has it handy, so the standby is still in sync but lagging behind (and the primary knows by how much)
b. the primary is not able to provide the requested WAL entry, so the slave is back to the 'setup' state, with its base LSN the point reached just before losing sync (the one walreceiver just asked for).

Now, a standby in 'setup' state isn't ready (yet), and for example synchronous replication won't be possible in this state: we can't ask the primary to refuse to COMMIT any transaction (holding it, e.g.) while a standby hasn't reached the 'sync' state.

The way you're talking about the issue makes me think there's a mix-up between how to handle a lagging standby and an out-of-sync standby. For clarity, I think we should have very distinct states and responses. And yes, as Tom and you keep saying, a synced standby by definition should not need any access to its primary's archives. So if it does, it's no longer in sync.

> I think we should definitely have both of those features, but they're not urgent. The replication works without them, although it requires that you set up traditional archiving as well.

Agreed, it's not essential for the feature as far as hackers are concerned.

Regards, -- dim
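Dimitri's five states translate naturally into a small state machine; here is a minimal sketch as a C enum, together with the fallback from case (b) above. The names are invented for illustration and are not from the Synch Rep patch.

    /* Illustration only: hypothetical names, not from the patch. */
    typedef enum StandbyState
    {
        STANDBY_BASE,       /* taking a base backup, reaching the base LSN */
        STANDBY_INIT,       /* got the current LSN, WAL streaming begins   */
        STANDBY_SETUP,      /* replaying WAL from base LSN up to init LSN  */
        STANDBY_CATCHUP,    /* replaying the stream buffered meanwhile     */
        STANDBY_SYNC        /* applying the stream as it arrives           */
    } StandbyState;

    /* Case (b): the primary can no longer provide the requested WAL,
     * so the standby drops back to 'setup' with its base LSN set to
     * the last point it had reached before losing sync. */
    static StandbyState
    on_wal_not_available(void)
    {
        return STANDBY_SETUP;
    }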
[HACKERS] Review: support for multiplexing SIGUSR1
Hi,

I'm reviewing this patch: http://archives.postgresql.org/message-id/3f0b79eb0907022341m1d36a841x19c3e2a5a6906...@mail.gmail.com

The patch applies almost cleanly, except for a minor hunk in elog.c and postinit.c. It compiles and passes the regression tests (I tried both steps on a Debian Lenny AMD Turion X2 64-bit box and on Windows XP SP2).

About the patch itself: Tom objected to a previous patch for this here: http://archives.postgresql.org/message-id/14969.1228835...@sss.pgh.pa.us

This new patch doesn't use the PGPROC struct anymore; instead it uses a ProcSignalSlot struct defined as:

    typedef struct
    {
        pid_t         pss_pid;
        sig_atomic_t  pss_signalFlags[NUM_PROCSIGNALS];
    } ProcSignalSlot;

which, AFAIU, seems to be in line with Tom's advice here: http://archives.postgresql.org/pgsql-hackers/2008-12/msg00556.php

Something that makes me nervous is this:

    /*
     * Note: Since there's no locking, it's possible that the target
     * process detaches from shared memory and exits right after this
     * test, before we set the flag and send signal.  And the signal slot
     * might even be recycled by a new process, so it's remotely possible
     * that we set a flag for a wrong process.  That's OK, all the signals
     * are such that no harm is done if they're mistakenly fired.
     */

Can we signal the wrong process and still be fine? Besides, it seems that SendProcSignal is still attached to SIGUSR1 only; is this fine?

The rest of the patch (or better ways of testing it) is beyond my knowledge... I think another reviewer should take a look at it, especially Tom, because he rejected the previous one...

-- Sincerely, Jaime Casanova, PostgreSQL support and training, systems consulting and development, Guayaquil - Ecuador, Cel. +59387171157
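To make the multiplexing scheme under review concrete, here is a minimal sketch of what the sending side of such a design typically looks like, built around the ProcSignalSlot struct quoted above. The reason names and the function are invented for illustration; the patch's actual SendProcSignal may differ in detail.

    #include <signal.h>
    #include <sys/types.h>

    #define NUM_PROCSIGNALS 2               /* illustrative count */

    typedef enum { REASON_CATCHUP, REASON_NOTIFY } ProcSignalReason;

    typedef struct
    {
        pid_t                 pss_pid;
        volatile sig_atomic_t pss_signalFlags[NUM_PROCSIGNALS];
    } ProcSignalSlot;

    /*
     * Sender side: set the per-reason flag, then deliver the one shared
     * signal.  As the quoted comment says, there is no locking, so the
     * worst case is a stray SIGUSR1 to a recycled slot -- harmless as
     * long as every receiver treats an unexpected flag as a no-op.
     */
    static int
    send_proc_signal(ProcSignalSlot *slot, ProcSignalReason reason, pid_t pid)
    {
        if (slot->pss_pid != pid)
            return -1;                      /* slot no longer owned by pid */
        slot->pss_signalFlags[reason] = 1;  /* which event happened        */
        return kill(pid, SIGUSR1);          /* single multiplexed wakeup   */
    }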
Re: [HACKERS] Synch Rep for CommitFest 2009-07
Hi,

On Thu, Jul 16, 2009 at 6:03 AM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:
> I don't think there's much point assigning more reviewers to Synch Rep at this point. I believe we have consensus on four major changes:

Thanks for clarifying the issues! Okay, I'll rework the patch.

> 1. Change the way synchronization is done when standby connects to primary. After authentication, standby should send a message to primary, stating the begin point (where begin is an XLogRecPtr, not a WAL segment name). Primary starts streaming WAL starting from that point, and keeps streaming forever. pg_read_xlogfile() needs to be removed.

I assume that begin should indicate the location of the last valid record. In other words, the standby first tries to recover using only the XLOG files which exist in its archive or pg_xlog. When it has reached the last valid record, it requests the XLOG records which follow begin from the primary. Is my understanding OK?

http://archives.postgresql.org/pgsql-hackers/2009-07/msg00475.php

As I described before, the XLOG file which the standby creates should be recoverable. So, when begin indicates the middle of an XLOG file, should the primary start sending the records from the head of the file containing begin? Or should the primary start from begin itself? In the latter case, since we can expect that the incomplete file containing begin also exists on the standby, the records following begin need to be appended to it. And, if that incomplete file was restored from the archive, it would need to be renamed from its temporary name before being appended to.

A timeline/backup history file is also required for recovery, but it's not found on the standby. So those history files need to be shipped from the primary, and that capability is provided by pg_read_xlogfile(). If we remove the function, how should we transfer the history files? Is a function similar to pg_read_xlogfile(), with which the filename needs to be specified, still necessary?

> 2. The primary should have no business reading back from the archive. The standby can read from the archive, as it can today.

In this case, a backup history file should be kept in pg_xlog for a while, because it might be requested by the standby. So far pg_start_backup() has removed the previous backup history file immediately. Should we introduce a new GUC parameter to determine how many backup history files are kept in pg_xlog?

CHECKPOINT should not recycle the XLOG files following the file which is being requested by the standby at that moment. So we need to tweak the recycling policy.

> 3. Need to support multiple WALSenders. While multiple slave support isn't 1st priority right now, it's not acceptable that a new WALSender can't connect while one is active already. That can cause trouble in case of network problems etc.

Sorry, I didn't get your point. You think multiple slave support isn't 1st priority, and yet why should a multiple-walsender mechanism be necessary? Can you describe the problem cases in more detail?

> 4. It is not acceptable that normal backends have to wait for walsender to send data.

Umm... this is true in the asynchronous replication case, and also while the standby is catching up with the primary. After the servers get into synchronization, the backend should wait for walsender to send the data (and also for walreceiver to write/fsync it) before returning success of COMMIT to the client. Is my understanding right?

In the current Synch Rep patch, the backend basically doesn't wait for walsender in asynchronous mode. Only when wal_buffers is filled with unsent data does the backend wait for walsender to send data, because there is no room to insert new data. Are you suggesting only that this problem case should be solved?

> That means that connecting a standby behind a slow connection to the primary can grind the primary to a halt.

This is the fate of *synchronous* replication, isn't it? If a user wants to get around such a problem, asynchronous mode should be chosen, I think.

> walsender needs to be able to read data from disk, not just from shared memory. (I raised this back in December: http://archives.postgresql.org/message-id/495106fa.1050...@enterprisedb.com)

OK, I'll try it.

> As a hint, I think you'll find it a lot easier if you implement only asynchronous replication at first. That reduces the amount of inter-process communication a lot. You can then add synchronous capability in a later commitfest. I would also suggest that for point 4, you implement WAL sender so that it *only* reads from disk at first, and only add the capability to send from wal_buffers later on, and only if performance testing shows that it's needed.

Sounds good. I'll advance development in stages as you suggested.

Regards, -- Fujii Masao, NIPPON TELEGRAPH AND TELEPHONE CORPORATION, NTT Open Source Software Center
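As background for the "head of the file vs. begin itself" question above, here is a minimal sketch of rounding a WAL position down to its containing segment boundary. It assumes the 8.4-era two-part XLogRecPtr layout and the default 16 MB segment size; it is an illustration, not code from the patch.

    #include <stdint.h>

    #define XLOG_SEG_SIZE  (16 * 1024 * 1024)  /* default WAL segment size */

    /* 8.4-era WAL location: a 4 GB "log file" id plus a byte offset. */
    typedef struct XLogRecPtr
    {
        uint32_t xlogid;    /* log file number */
        uint32_t xrecoff;   /* byte offset within that log file */
    } XLogRecPtr;

    /*
     * Start of the segment containing 'begin' -- the point the primary
     * would stream from if it resends the whole file containing begin
     * rather than starting mid-file.
     */
    static XLogRecPtr
    segment_start(XLogRecPtr begin)
    {
        begin.xrecoff -= begin.xrecoff % XLOG_SEG_SIZE;
        return begin;
    }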
Re: [HACKERS] [GENERAL] pg_migrator not setting values of sequences?
Tom Lane <t...@sss.pgh.pa.us> writes:
> Bruce Momjian <br...@momjian.us> writes:
>> A larger question is what do we do with pg_migrator now. I am embarrassed I didn't find these errors before, but now that they are known, and will probably need an 8.4.1 to fix, should I remove the pg_migrator 8.4 source code from pgfoundry?
> Well, we'd still like people to test it, but at the very least it should be marked as having known bugs.

What about having pg_migrator bail out if the destination cluster version is < 80401?

Regards, -- dim
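A guard like the one Dimitri suggests is nearly a one-liner; here is a minimal sketch, assuming PostgreSQL's usual numeric version encoding (major*10000 + minor*100 + patchlevel, so 8.4.1 is 80401). The function name is invented.

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical pg_migrator guard: refuse destination clusters older
     * than 8.4.1, where the known migration bugs are unfixed. */
    static void
    check_destination_version(int version_num)
    {
        if (version_num < 80401)
        {
            fprintf(stderr,
                    "pg_migrator: destination cluster reports %d; "
                    "8.4.1 (80401) or later is required\n", version_num);
            exit(1);
        }
    }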
[HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
Hi Robert, Hi all,

The patch applies cleanly and works as intended -- no surprise there. After the changes the documentation is at least as easy to understand as before, and the code changes look sensible. Also not surprisingly, that's not the area where I expected problems, I guess ;-)

For performance testing I replayed query logs from sites I could easily get my hands on (3 different, halfway interesting ones). I found no relevant differences on the first site, which is sensible because {from,join}_collapse_limit wasn't reached anyway.

More interesting are the queries from the two sites with reporting queries: On the first, simpler schema I found on average a 30% plan time increase and a 40% execution time decrease. Most of the queries stayed the same; only a few changed radically (in both directions). No big differences between geqo=on/off.

The queries on the second reporting schema unfortunately are different. It's the one where I copied the crazy example I attached in the original thread. With geqo=off a good part of the queries used daily take too much memory to plan sensibly, and geqo=on outright fails with "ERROR: failed to make a valid plan" on some. I stopped trying to make performance measurements there. Noticeably, even some plans which were plannable in reasonable time before are now problematic with enable_join_ordering=false! I agree that those queries are crazy, but I am not sure how many of those are out there...

So, while I think the changes are principally a good idea, as {from,join}_collapse_limit are somewhat confusing options, I personally(!) do not think geqo is ready for this today, especially as the benefit is relatively small. If I am the only one with access to such complicated queries, that's fine -- I am working on that site's query generation/schema anyway. Could perhaps some other people with complicated queries check how they work out with these changes? It should be enough to check with very big {join,from}_collapse_limit values. Kevin?

I also have to admit that I somewhat like the current behaviour in theory. Currently you can have a view with a hand-optimized JOIN order which will not get inlined and/or reordered, use it together with something unoptimized, and the unoptimized part will be reordered in many cases...

I found it somewhat hard to review a patch where my opinion was biased from the beginning. As Tom listed himself as a reviewer, I will happily (err?) concede to his and your judgement.

Andres
Re: [HACKERS] Synch Rep for CommitFest 2009-07
Fujii Masao wrote:
> On Thu, Jul 16, 2009 at 6:03 AM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:
>> 1. Change the way synchronization is done when standby connects to primary. [...] pg_read_xlogfile() needs to be removed.
> I assume that begin should indicate the location of the last valid record. In other words, the standby first tries to recover using only the XLOG files which exist in its archive or pg_xlog. When it has reached the last valid record, it requests the XLOG records which follow begin from the primary. Is my understanding OK?

Yes.

> http://archives.postgresql.org/pgsql-hackers/2009-07/msg00475.php
> As I described before, the XLOG file which the standby creates should be recoverable. So, when begin indicates the middle of an XLOG file, should the primary start sending the records from the head of the file containing begin? Or should the primary start from begin itself? In the latter case, since we can expect that the incomplete file containing begin also exists on the standby, the records following begin need to be appended to it.

I would expect the standby to append to the partial XLOG file.

> And, if that incomplete file was restored from the archive, it would need to be renamed from its temporary name before being appended to.

The archive should not normally contain partial XLOG files -- only if you copy one there manually after the primary has crashed. So I don't think that's something we need to support.

> A timeline/backup history file is also required for recovery, but it's not found on the standby. So those history files need to be shipped from the primary, and that capability is provided by pg_read_xlogfile(). If we remove the function, how should we transfer the history files? Is a function similar to pg_read_xlogfile(), with which the filename needs to be specified, still necessary?

Hmm. You only need the timeline history file if the base backup was taken on an earlier timeline. That situation would only arise if you (manually) take a base backup, restore it to a server (which creates a new timeline), and then create a slave against that server. At least in the 1st phase, I think we can assume that the standby has access to the same archive and will find the history file there. If not, throw an error. We can add more bells and whistles later.

> CHECKPOINT should not recycle the XLOG files following the file which is being requested by the standby at that moment. So we need to tweak the recycling policy.

Yep.

>> 3. Need to support multiple WALSenders. While multiple slave support isn't 1st priority right now, it's not acceptable that a new WALSender can't connect while one is active already. That can cause trouble in case of network problems etc.
> Sorry, I didn't get your point. You think multiple slave support isn't 1st priority, and yet why should a multiple-walsender mechanism be necessary? Can you describe the problem cases in more detail?

As the patch stands, new walsender connections are refused when one is active already. What if the existing walsender connection is in a zombie state? For example, it's trying to send WAL to the slave, but the network connection is down and the packets are going to a black hole. It will take a while for the TCP layer to declare the connection dead and close the socket. During that time, you can't connect a new slave to the master, or reconnect the same slave over a better network path. The most robust way to fix that is to support multiple walsenders: the zombie walsender can take its time to die, while a new walsender serves the new connection. You could tweak SO_TIMEOUTs and such, but even then the standby process could be in some weird hung state. And of course, when we get around to adding support for multiple slaves, we'll have to do it anyway. Better to get it right to begin with.

>> 4. It is not acceptable that normal backends have to wait for walsender to send data.
> Umm... this is true in the asynchronous replication case, and also while the standby is catching up with the primary. After the servers get into synchronization, the backend should wait for walsender to send the data (and also for walreceiver to write/fsync it) before returning success of COMMIT to the client. Is my understanding right?

Even in synchronous replication, a backend should only have to wait when it commits. You would only see the difference with very large transactions that write more WAL than fits in wal_buffers, such as data loading.

> In the current Synch Rep patch, the backend basically doesn't wait for walsender in asynchronous mode. Only when wal_buffers is filled with unsent data does the backend wait for walsender to send data, because there is no room to insert new data. Are you suggesting only that this problem case should be solved?
[HACKERS] boolean in C
Hi folks,

Today I got bitten a bit while trying to write a C function for PostgreSQL that also includes some of our company's internal headers. Needless to say, our stuff defines BOOL, and so does PostgreSQL's c.h include file. We will probably change it on our side, but is there any reason for PostgreSQL not to use stdbool.h nowadays, apart from the fact that no one has made the effort? Having said that, wouldn't it be as easy as replacing all TRUE/FALSE/BOOL with the lowercase ones and including stdbool.h from c.h?
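For readers who haven't hit this, here is a minimal sketch of the clash being described, with a stand-in project header and a simplified rendering of the c.h-style definitions (not the verbatim contents of either header):

    /* company_types.h -- stand-in for the internal header */
    typedef int BOOL;                /* 4-byte boolean */
    #define TRUE  1
    #define FALSE 0

    /* simplified shape of PostgreSQL's c.h definitions */
    #ifndef __cplusplus
    typedef char bool;               /* 1 byte, matches on-disk format */
    #endif
    #ifndef TRUE
    #define TRUE  1                  /* skipped if the other header won */
    #endif
    #ifndef FALSE
    #define FALSE 0
    #endif

    /* Whichever header is included first wins the definitions, and
     * passing the 4-byte BOOL where the backend expects the 1-byte
     * bool silently breaks struct layout and Datum passing. */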
Re: [HACKERS] [GENERAL] pg_migrator not setting values of sequences?
On Thursday 16 July 2009 07:09:22 Bruce Momjian wrote:

Uh, how is this going to behave in 8.5? Do we still dump sequences, and if so, aren't we heading down the road of dumping stuff only because a previous release needed it?

Which leads me to a related question: Do you plan to maintain one version of pg_migrator that can upgrade any version to any other version (within reason), or will there be separate binaries, say pg_migrator-8.4 and pg_migrator-8.5, that each can only upgrade from $selfversion-1 to $selfversion?
Re: [HACKERS] slow count in window query
On Wed, Jul 15, 2009 at 11:18 AM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
> postgres=# select avg(a) from (select a, row_number() over (order by a) as r, count(*) over () as rc from x) p where r in ((rc+1)/2, (rc+2)/2);

How does this compare to the plain non-windowing SQL implementation:

    select a from x order by a offset (select trunc(count(*)/2) from x) limit 1

(except that that only works if count(*) is odd).

Interestingly, finding the median is actually O(n) using Quickselect. Maybe we should provide a C implementation of quickselect as a window function. I'm not sure how to wedge in the concept that the sort is unnecessary even though the ORDER BY is specified, though.

I'm also not sure how to handle this if the set has to be spooled to disk. Quicksort and Quickselect do a lot of scans through the data and wouldn't perform well on disk.

-- greg, http://mit.edu/~gsstark/resume.pdf
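Since Greg raises a C implementation of quickselect, here is a minimal standalone sketch of the algorithm (Hoare's selection) over an in-memory array of float8 values. It is not part of any patch, and as noted above it assumes the whole input is addressable in memory -- exactly the property a spooled-to-disk set lacks.

    /* k-th smallest element (0-based) of a[0..n-1]; expected O(n).
     * Partially reorders the array in place. */
    static double
    quickselect(double *a, long n, long k)
    {
        long lo = 0, hi = n - 1;

        while (lo < hi)
        {
            double pivot = a[(lo + hi) / 2];
            long   i = lo, j = hi;

            while (i <= j)          /* Hoare partition around pivot */
            {
                while (a[i] < pivot) i++;
                while (a[j] > pivot) j--;
                if (i <= j)
                {
                    double tmp = a[i]; a[i] = a[j]; a[j] = tmp;
                    i++; j--;
                }
            }
            if (k <= j)             /* k-th lies in the left part  */
                hi = j;
            else if (k >= i)        /* k-th lies in the right part */
                lo = i;
            else
                return a[k];        /* k landed on the pivot band  */
        }
        return a[lo];
    }

A median over an odd-sized set is then quickselect(a, n, n/2), with no full sort needed.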
Re: [HACKERS] slow count in window query
> I'm also not sure how to handle this if the set has to be spooled to disk. Quicksort and Quickselect do a lot of scans through the data and wouldn't perform well on disk.

I think the problem is in the aggregate function used as a window function -- or some missing optimization. When I replaced count(*) over () with a subselect (SELECT count(*) FROM ...), I got the expected speed.

Pavel
Re: [HACKERS] Why does pg_standby require libpq.so.5?
Committed.

Fujii Masao wrote:
> Hi, On Thu, Jun 25, 2009 at 10:09 PM, Magnus Hagander <mag...@hagander.net> wrote:
>> The second question is, is it worth doing this so extremely late in the 8.4 development? After mentioning it quickly in an offlist discussion with Heikki, I think our conclusion was that we should wait with this until the tree opens for 8.5.
> That's OK. I'll wait for 8.5.
>> It's not a very likely scenario that anybody actually has pg_standby on a machine that doesn't have libpq on it - since it needs to have a PostgreSQL server on it to make any sense
> My first concern was whether the latest pg_standby can work with an old postgres (old libpq.so).

-- Heikki Linnakangas, EnterpriseDB, http://www.enterprisedb.com
Re: [HACKERS] [PATCH] DefaultACLs
Hi Petr,

> this is the first public version of our DefaultACLs patch as described on http://wiki.postgresql.org/wiki/DefaultACL .

I have been assigned by Robert to do an initial review of your GRANT ON ALL patch mentioned here: http://archives.postgresql.org/pgsql-hackers/2009-07/msg00207.php

Does this new DefaultACL patch nullify the earlier one? Or is it different and should the earlier one be looked at first, since it was added to the commitfest before the DefaultACL patch? It is a bit confusing. Please clarify.

Regards, Nikhils

> It allows GRANT/REVOKE permissions to be inherited by objects based on schema permissions at create time by use of the ALTER SCHEMA foo SET DEFAULT PRIVILEGES ON TABLE SELECT TO bar syntax. There are also ADD and DROP for appending and removing those default privileges. It works for tables, views, sequences and functions. More info about the syntax and some previous discussion is on the wiki. There is also GRANT DEFAULT PRIVILEGES ON tablename, which *replaces* current object privileges with the default ones. Only the owner can do both of those commands (ALTER SCHEMA can be done only by the schema owner and GRANT can be done only by the object owner). It adds a new catalog table which stores the default permissions for a given schema and object type. We didn't add a syscache entry for that as Stephen Frost didn't feel we should do that (yet). Three functions were also exported from aclchk.c because most of the ALTER SCHEMA stuff is done in schemacmds.c. The current version is fully working and includes some regression tests. There is however no documentation at this moment. The patch is against current Git HEAD (it is a context diff).
> -- Regards, Petr Jelinek (PJMODOS)

-- http://www.enterprisedb.com
Re: [HACKERS] [GENERAL] pg_migrator not setting values of sequences?
Dimitri Fontaine wrote:
> Tom Lane <t...@sss.pgh.pa.us> writes:
>> Bruce Momjian <br...@momjian.us> writes:
>>> A larger question is what do we do with pg_migrator now. I am embarrassed I didn't find these errors before, but now that they are known, and will probably need an 8.4.1 to fix, should I remove the pg_migrator 8.4 source code from pgfoundry?
>> Well, we'd still like people to test it, but at the very least it should be marked as having known bugs.
> What about having pg_migrator bail out if the destination cluster version is < 80401?

We can do that, but because 8.4.1 is not released yet, we might as well just remove the source code.

-- Bruce Momjian, br...@momjian.us, http://momjian.us
EnterpriseDB, http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] boolean in C
On 16 July 2009 11:12:34 +0100, Grzegorz Jaskiewicz <g...@pointblue.com.pl> wrote:
> Now, for us, we will probably change it, but is there any reason for PostgreSQL not to use stdbool.h nowadays, apart from the fact that no one has made the effort? Having said that, wouldn't it be as easy as replacing all TRUE/FALSE/BOOL with the lowercase ones and including stdbool.h from c.h?

If I remember correctly, older Sun compilers don't have it. I don't know if that is still an issue, but we need to take care of non-gcc compilers, I think.

-- Thanks, Bernd
Re: [HACKERS] [PATCH] DefaultACLs
Nikhil Sontakke wrote:
> Does this new DefaultACL patch nullify the earlier one? Or is it different and should the earlier one be looked at first, since it was added to the commitfest before the DefaultACL patch? It is a bit confusing. Please clarify.

No, DefaultACLs applies to objects created in the future, while GRANT ON ALL affects existing objects. DefaultACLs is the more important functionality, so it should probably take precedence in the review process.

There is however one thing that needs some attention. Both patches add a distinction between VIEW and TABLE objects for ACLs to the parser, and they do it differently. GRANT ON ALL works by adding ACL_OBJECT_VIEW and tracking that object type in code (that was my original method in both patches), while DefaultACLs uses the method suggested by Stephen Frost, which is creating a new enum with relation, view, function and sequence members (those are the object types for which both DefaultACLs and GRANT ON ALL are applicable). The second method has the advantage of minimal changes to existing code. It's pointless to use both methods, so one of the patches will have to be adjusted. The problem is that most people seem to dislike the addition of ACL_OBJECT_VIEW, but on the other hand I don't like the idea of adding another object-type variable into the GrantStmt struct, which would be needed if we adjusted GRANT ON ALL to Stephen Frost's method.

-- Regards, Petr Jelinek (PJMODOS)
Re: [HACKERS] Synch Rep for CommitFest 2009-07
On Jul 16, 2009, at 12:07 AM, Heikki Linnakangas wrote:
> Dimitri Fontaine wrote:
>> On 15 Jul 2009, at 23:03, Heikki Linnakangas wrote: [...]
>> Furthermore, the counter-argument against having the primary able to send data from the archives to some standby is that it should still work when the primary's dead, but as this is only done in the setup phase, I don't see that being able to continue preparing a not-yet-ready standby against a dead primary is buying us anything.
> The situation also arises when the standby falls badly behind. A simple solution to that is to add a switch on the master to always keep X MB of WAL in pg_xlog. The standby will then still find the WAL in pg_xlog, making it harder for a standby to fall so far behind that it can't find the WAL it needs on the primary anymore. Tom suggested that we can just give up and re-sync with a new base backup, but that really requires built-in base backup capability, and is only practical for small databases.

If you use an rsync-like algorithm for doing the base backups, wouldn't that increase the size of database for which it would still be practical to just re-sync? Couldn't you in fact sync a very large database if the amount of actual change in the files was a small percentage of the total size?
Re: [HACKERS] boolean in C
On 16 Jul 2009, at 12:52, Bernd Helmle wrote:
> If I remember correctly, older Sun compilers don't have it. I don't know if that is still an issue, but we need to take care of non-gcc compilers, I think.

For those guys, we could just keep what's in c.h as it is now...
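The fallback arrangement being suggested could look roughly like this -- a minimal sketch, not the actual c.h contents, with HAVE_STDBOOL_H and SIZEOF_BOOL standing in for whatever configure tests would be needed:

    /* hypothetical c.h arrangement -- illustration only */
    #ifndef __cplusplus
    #if defined(HAVE_STDBOOL_H) && SIZEOF_BOOL == 1
    #include <stdbool.h>        /* C99 compilers: _Bool-backed bool   */
    #else
    typedef char bool;          /* pre-C99 compilers, e.g. old Sun cc */
    #define true  ((bool) 1)
    #define false ((bool) 0)
    #endif
    #endif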
Re: [HACKERS] boolean in C
Oh, another thing: stdbool is a C99 standard feature, not a gcc extension. Just in case someone thinks otherwise.
Re: [HACKERS] [GENERAL] pg_migrator not setting values of sequences?
Peter Eisentraut <pete...@gmx.net> writes:
> Which leads me to a related question: Do you plan to maintain one version of pg_migrator that can upgrade any version to any other version (within reason), or will there be separate binaries, say pg_migrator-8.4 and pg_migrator-8.5, that each can only upgrade from $selfversion-1 to $selfversion?

I think we should plan on the latter, at least until we have a few versions under our belts and can see what we might be letting ourselves in for. My guess is that n-1 to n will be plenty challenging enough.

regards, tom lane
Re: [HACKERS] slow count in window query
2009/7/16 Greg Stark <gsst...@mit.edu>:
> On Wed, Jul 15, 2009 at 11:18 AM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
>> postgres=# select avg(a) from (select a, row_number() over (order by a) as r, count(*) over () as rc from x) p where r in ((rc+1)/2, (rc+2)/2);
> How does this compare to the plain non-windowing SQL implementation: select a from x order by a offset (select trunc(count(*)/2) from x) limit 1 (except that that only works if count(*) is odd). Interestingly, finding the median is actually O(n) using Quickselect. Maybe we should provide a C implementation of quickselect as a window function. I'm not sure how to wedge in the concept that the sort is unnecessary even though the ORDER BY is specified, though.

median() should be an aggregate, not a window function, shouldn't it?

> I'm also not sure how to handle this if the set has to be spooled to disk. Quicksort and Quickselect do a lot of scans through the data and wouldn't perform well on disk.

The WindowAgg spools rows into the tuplestore, which holds the data in memory as long as it fits. Do you have any idea how to store millions and millions of rows without a tuplestore?

Regards, -- Hitoshi Harada
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
Andres Freund <and...@anarazel.de> writes:
> The queries on the second reporting schema unfortunately are different. It's the one where I copied the crazy example I attached in the original thread. With geqo=off a good part of the queries used daily take too much memory to plan sensibly, and geqo=on outright fails with "ERROR: failed to make a valid plan" on some.

We're not going to be able to fix this unless you show us examples.

> Noticeably, even some plans which were plannable in reasonable time before are now problematic with enable_join_ordering=false!

And this even more so --- it doesn't make any sense at all.

> So, while I think the changes are principally a good idea, as {from,join}_collapse_limit are somewhat confusing options, I personally(!) do not think geqo is ready for this today, especially as the benefit is relatively small.

In general I think that any such bugs are there anyway and need to be fixed anyway.

regards, tom lane
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thursday 16 July 2009 15:13:02 Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
>> [...] geqo=on outright fails with "ERROR: failed to make a valid plan" on some.
> We're not going to be able to fix this unless you show us examples.

In the other thread I attached a schema similar to the real one, plus an example query. Not enough? And why?

>> Noticeably, even some plans which were plannable in reasonable time before are now problematic with enable_join_ordering=false!
> And this even more so --- it doesn't make any sense at all.

Why? With a high from_collapse_limit more subqueries get inlined -- before inlining they get planned separately.

>> So, while I think the changes are principally a good idea, [...] I personally(!) do not think geqo is ready for this today, especially as the benefit is relatively small.
> In general I think that any such bugs are there anyway and need to be fixed anyway.

Understandable.

Andres
Re: [HACKERS] boolean in C
Grzegorz Jaskiewicz <g...@pointblue.com.pl> writes:
> oh, another thing. stdbool is a C99 standard feature.

We are still targeting C89, not C99. Another reason not to depend on stdbool is that, so far as I can see, the standard does not promise that type _Bool has size == 1 byte. We have to have that because of on-disk compatibility requirements.

regards, tom lane
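Tom's one-byte requirement can at least be checked mechanically at build time. Here is a minimal sketch using the classic C89-compatible negative-array-size trick; the macro name is invented:

    #include <stdbool.h>

    /* Compilation fails (negative array size) whenever cond is false. */
    #define COMPILE_TIME_ASSERT(name, cond) \
        typedef char name[(cond) ? 1 : -1]

    /* On-disk compatibility needs a one-byte bool: a platform whose
     * C99 _Bool is wider would fail here instead of corrupting data. */
    COMPILE_TIME_ASSERT(bool_is_one_byte, sizeof(bool) == 1);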
Re: [HACKERS] boolean in C
On 16 Jul 2009, at 14:20, Tom Lane wrote:
> We are still targeting C89, not C99. Another reason not to depend on stdbool is that, so far as I can see, the standard does not promise that type _Bool has size == 1 byte. We have to have that because of on-disk compatibility requirements.

I think the latter is easily fixable, or forceable to be one byte. Why C89, and not C99? Virtually all compilers in the last 4 years have had C99 support.
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thursday 16 July 2009 15:18:01 Andres Freund wrote:
> On Thursday 16 July 2009 15:13:02 Tom Lane wrote:
>> We're not going to be able to fix this unless you show us examples.
> In the other thread I attached a schema similar to the real one, plus an example query. Not enough? And why?

For reference: http://archives.postgresql.org/message-id/200907091700.43411.and...@anarazel.de

Andres
Re: [HACKERS] boolean in C
Grzegorz Jaskiewicz <g...@pointblue.com.pl> writes:
> Why C89, and not C99? Virtually all compilers in the last 4 years have had C99 support.

Not everybody is running a compiler released within the last 4 years. The short answer is that C99 doesn't appear to offer enough advantage over C89, *for our purposes*, to justify freezing out older systems. stdbool is a perfect example of an addition that offers precisely zero actual functional improvement. All it would be for us is an additional autoconf headache and portability hazard.

regards, tom lane
Re: [HACKERS] slow count in window query
2009/7/16 Pavel Stehule <pavel.steh...@gmail.com>:
>> I'm also not sure how to handle this if the set has to be spooled to disk. Quicksort and Quickselect do a lot of scans through the data and wouldn't perform well on disk.
> I think the problem is in the aggregate function used as a window function -- or some missing optimization. When I replaced count(*) over () with a subselect (SELECT count(*) FROM ...), I got the expected speed.

WindowAgg always spools its input into the buffer, even though (in your case) it throws rows away one by one, so compared with a pure aggregate it has overhead. I think this is a reasonable approach for the large-data situation and for different types of windows. But yes, we must improve the current model:

1) There should be some kind of lightweight approach for such small-data/simple-window situations.
2) tuplestore_puttupleslot() seems heavy to me (copy, check, etc.) even if the data fits in memory after trimming rows. We want more flexible temporary storage on the fly.

Regards, -- Hitoshi Harada
Re: [HACKERS] [PATCH] Psql List Languages
On Thursday 16 July 2009 00:38:31 Fernando Ike de Oliveira wrote:
> I applied Tom Lane's and Peter's suggestions, but I had to remove one column (Owner) from the \dL command for compatibility with version 7.4.

The mandate is to work as well as possible with older versions, not to provide only the feature set that works the same across old versions. The correct behavior would be to show the owner column if the server version supports it.
Re: [HACKERS] boolean in C
On Thursday 16 July 2009 16:23:31 Grzegorz Jaskiewicz wrote:
> On 16 Jul 2009, at 14:20, Tom Lane wrote:
>> We are still targeting C89, not C99. Another reason not to depend on stdbool is that, so far as I can see, the standard does not promise that type _Bool has size == 1 byte. We have to have that because of on-disk compatibility requirements.
> I think the latter is easily fixable, or forceable to be one byte.

How do you plan to do that?

> Why C89, and not C99? Virtually all compilers in the last 4 years have had C99 support.

Well, I think we want to run on systems that are older than 4 years, too.
Re: [HACKERS] boolean in C
On 16 Jul 2009, at 14:53, Peter Eisentraut wrote:
>> I think the latter is easily fixable, or forceable to be one byte.
> How do you plan to do that?

By casting it to a 1-byte type such as char? I don't think anyone will add a third state to boolean in stdbool, at least not any time soon :) And it is pretty annoying when your product also has its own BOOLean defined...

>> Why C89, and not C99? Virtually all compilers in the last 4 years have had C99 support.
> Well, I think we want to run on systems that are older than 4 years, too.

Sure, but that's probably less than 1% of all systems. The 4 years was a guess; I think it's much more than that.
Re: [HACKERS] Mostly Harmless: c++bookends - patch 2 of 4
On Mon, Jul 13, 2009 at 5:51 PM, Peter Eisentraut <pete...@gmx.net> wrote:
> So I think either decoration is added to all of these files or none of them. And I think the former is not going to go over well.

We do have some things that are conditioned on __cplusplus already, such as c.h, pg_config.h.in, and postgres_ext.h. So at some point we at least thought about supporting inclusion of our header files from C++. But I agree that if we're going to do it at all, we ought to do it everywhere.

...Robert
Re: [HACKERS] [PATCH] plpythonu datatype conversion improvements
On Wednesday 27 May 2009 02:07:33 Caleb Welton wrote:
> Patch for plpythonu

This patch doesn't apply; I think it got mangled during email transport. (Tabs changed to spaces, it looks like.) Could you resend the patch as a separate attachment in a way that it doesn't get mangled?
[HACKERS] revised hstore patch
Revision to the previous hstore patch to fix (and add tests for) some edge-case bugs with nulls or empty arrays.

-- Andrew (irc:RhodiumToad)

Attachment: hstore_20090716.patch.gz (hstore patch)
Re: [HACKERS] boolean in C
Grzegorz Jaskiewicz <g...@pointblue.com.pl> writes:
> On 16 Jul 2009, at 14:53, Peter Eisentraut wrote:
>>>> the standard does not promise that type _Bool has size == 1 byte. We have to have that because of on-disk compatibility requirements.
>>> I think the latter is easily fixable, or forceable to be one byte.
>> How do you plan to do that?
> By casting it to a 1-byte type such as char?

That's hardly going to improve readability for anyone. Also, it will flat out not work for the catalog struct declarations. When we say

    bool relhasindex;

the compiler had better think that that's a one-byte field.

> And it is pretty annoying when your product also has its own BOOLean defined...

IOW you're not using stdbool either?

regards, tom lane
Re: [HACKERS] Mostly Harmless: c++bookends - patch 2 of 4
On Thursday 16 July 2009 17:00:03 Robert Haas wrote:
> On Mon, Jul 13, 2009 at 5:51 PM, Peter Eisentraut <pete...@gmx.net> wrote:
>> So I think either decoration is added to all of these files or none of them. And I think the former is not going to go over well.
> We do have some things that are conditioned on __cplusplus already, such as c.h, pg_config.h.in, and postgres_ext.h. So at some point we at least thought about supporting inclusion of our header files from C++. But I agree that if we're going to do it at all, we ought to do it everywhere.

We do support using the frontend headers (libpq, ecpg) from C++. That's what postgres_ext.h is about. The code in pg_config.h.in is autogenerated by Autoconf. The stuff in c.h is probably still there from before we rearranged the header files so that the frontend includes don't use c.h.
Re: [HACKERS] boolean in C
On 16 Jul 2009, at 15:17, Tom Lane wrote:
> That's hardly going to improve readability for anyone. Also, it will flat out not work for the catalog struct declarations. When we say "bool relhasindex;" the compiler had better think that that's a one-byte field.

Sure, but I would certainly hope there are not too many places where you actually convert it from the disk representation to the internal one and vice versa.

> IOW you're not using stdbool either?

Well, saying that I don't is quite an overstatement. It was decided long before I started working for that customer, and the code is full of problems like that. But still, it would be nice for PostgreSQL to at least not cause problems like this. Having said that, I will probably fix it on the customer's side, but I wanted to see if you guys would be happy with a patch that changes this in PostgreSQL.

thanks.
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
Andres Freund <and...@anarazel.de> writes:
> On Thursday 16 July 2009 15:13:02 Tom Lane wrote:
>> Andres Freund <and...@anarazel.de> writes:
>>> geqo=on outright fails with "ERROR: failed to make a valid plan" on some.
>> We're not going to be able to fix this unless you show us examples.
> In the other thread I attached a schema similar to the real one, plus an example query. Not enough? And why?

I tried the example query and couldn't get "failed to make a valid plan" out of it ... what settings do you need for that?

However, I do observe that this seems a sufficient counterexample against the theory that we can just remove the collapse limits and let GEQO save us on very complex queries. On my machine, the example query takes about 22 seconds to plan using CVS HEAD w/ all default settings. If I set both collapse_limit variables to very high values (I used 999), it takes ... um ... not sure; I gave up waiting after half an hour. I also tried with geqo_effort reduced to the minimum of 1, but that didn't produce a plan in reasonable time either (I gave up after ten minutes). So if we remove the collapse limits, Postgres will completely fail on this query --- the only way out would be enable_join_ordering = off, which is hardly likely to produce a decent plan.

Maybe we should leave the collapse_limit logic alone and address Robert's gripes by just raising the default values a lot (I'm thinking 100 or so). That way there's an escape hatch for anyone who has pathological queries to deal with --- just dial the settings down.

>>> Noticeably, even some plans which were plannable in reasonable time before are now problematic with enable_join_ordering=false!
>> And this even more so --- it doesn't make any sense at all.
> Why? With a high from_collapse_limit more subqueries get inlined -- before inlining they get planned separately.

Okay, I misunderstood which two cases you were comparing there.

regards, tom lane
[HACKERS] Re: Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thu, Jul 16, 2009 at 4:16 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> However, I do observe that this seems a sufficient counterexample against the theory that we can just remove the collapse limits and let GEQO save us on very complex queries. On my machine, the example query takes about 22 seconds to plan using CVS HEAD w/ all default settings. If I set both collapse_limit variables to very high values (I used 999), it takes ... um ... not sure; I gave up waiting after half an hour.

What's the point of GEQO if it doesn't guarantee to produce the optimal plan and *also* doesn't guarantee to produce some plan, any plan, within some reasonable amount of time? Either we need to fix that, or else I don't see what it's buying us over our regular planner, which also might not produce a plan within a reasonable amount of time, but at least if it does it'll be the right plan.

-- greg, http://mit.edu/~gsstark/resume.pdf
Re: [HACKERS] boolean in C
Grzegorz Jaskiewicz wrote on Thu, 16 Jul 2009 at 14:59 +0100:
>>> Why C89, and not C99? Virtually all compilers in the last 4 years have had C99 support.
>> Well, I think we want to run on systems that are older than 4 years, too.
> Sure, but that's probably less than 1% of all systems. The 4 years was a guess; I think it's much more than that.

For example, Solaris 8 is 9 years old and is still used in production. I guess HP-UX is in the same situation. And so on. I am not able to say how many PostgreSQL installations run on them, but as Tom mentioned, there is no significant reason to break old platforms.

Zdenek
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
I wrote:
> If I set both collapse_limit variables to very high values (I used 999), it takes ... um ... not sure; I gave up waiting after half an hour. I also tried with geqo_effort reduced to the minimum of 1, but that didn't produce a plan in reasonable time either (I gave up after ten minutes).

After I gave up letting the machine be idle to get a fair timing, I turned on oprofile monitoring. It looks a bit interesting:

    samples  %        image name    symbol name
    886498   53.8090  postgres      have_relevant_eclass_joinclause
    460596   27.9574  postgres      bms_overlap
    142764    8.6655  postgres      bms_is_subset
    126274    7.6646  postgres      have_join_order_restriction
    14205     0.8622  postgres      list_nth_cell
    2721      0.1652  postgres      generate_join_implied_equalities
    2445      0.1484  libc-2.9.so   memset
    2202      0.1337  postgres      have_relevant_joinclause
    1678      0.1019  postgres      make_canonical_pathkey
    1648      0.1000  postgres      pfree
    884       0.0537  postgres      bms_union
    762       0.0463  postgres      gimme_tree
    660       0.0401  libc-2.9.so   memcpy
    571       0.0347  postgres      AllocSetFree
    475       0.0288  postgres      AllocSetAlloc
    431       0.0262  postgres      has_relevant_eclass_joinclause
    389       0.0236  postgres      check_list_invariants
    260       0.0158  postgres      join_is_legal
    238       0.0144  postgres      bms_copy

So maybe a redesign of the equivalence-class joinclause mechanism is in order. Still, this is unlikely to fix the fundamental issue that the time for large join problems grows nonlinearly.

regards, tom lane
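Since bms_overlap is the second-hottest symbol in the profile above, a simplified sketch of what it does may help frame the discussion. This is modelled on PostgreSQL's Bitmapset (the real definitions live in nodes/bitmapset.h and differ in detail):

    #include <stdbool.h>
    #include <stdint.h>

    /* Simplified Bitmapset: a length-prefixed array of bit words. */
    typedef struct Bitmapset
    {
        int      nwords;
        uint32_t words[1];          /* really variable-length */
    } Bitmapset;

    /* True if the sets share any member: a word-wise AND over the
     * shorter array.  Cheap per call -- the cost in the profile comes
     * from how often join-order search calls it on large queries. */
    static bool
    bms_overlap(const Bitmapset *a, const Bitmapset *b)
    {
        int shortlen, i;

        if (a == NULL || b == NULL)
            return false;           /* empty sets overlap nothing */
        shortlen = (a->nwords < b->nwords) ? a->nwords : b->nwords;
        for (i = 0; i < shortlen; i++)
            if (a->words[i] & b->words[i])
                return true;
        return false;
    }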
Re: [HACKERS] Re: Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thu, Jul 16, 2009 at 04:27:39PM +0100, Greg Stark wrote:
> On Thu, Jul 16, 2009 at 4:16 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> However, I do observe that this seems a sufficient counterexample against the theory that we can just remove the collapse limits and let GEQO save us on very complex queries. On my machine, the example query takes about 22 seconds to plan using CVS HEAD w/ all default settings. If I set both collapse_limit variables to very high values (I used 999), it takes ... um ... not sure; I gave up waiting after half an hour.
> What's the point of GEQO if it doesn't guarantee to produce the optimal plan and *also* doesn't guarantee to produce some plan, any plan, within some reasonable amount of time? Either we need to fix that, or else I don't see what it's buying us over our regular planner, which also might not produce a plan within a reasonable amount of time, but at least if it does it'll be the right plan.

I do agree that we should have an actual time-limit cap for GEQO that would have it return the best plan found so far at that time. Then you can at least bound your planning time.

Regards, Ken
Re: [HACKERS] boolean in C
Grzegorz Jaskiewicz g...@pointblue.com.pl writes: On 16 Jul 2009, at 15:17, Tom Lane wrote: That's hardly going to improve readability for anyone. Also, it will flat out not work for the catalog struct declarations. When we say bool relhasindex; the compiler had better think that that's a one-byte field. Sure, but I would certainly hope there's not too many places where you actually convert it from disk representation to internal and vice versa. We don't convert --- it's expected to be the same representation. As for not too many of them, I think grepping for references to bool catalog fields will show you differently ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thursday 16 July 2009 17:16:31 Tom Lane wrote: Andres Freund and...@anarazel.de writes: On Thursday 16 July 2009 15:13:02 Tom Lane wrote: Andres Freund and...@anarazel.de writes: Error: Failed to make a valid plan We're not going to be able to fix this unless you show us examples. In the other thread I attached a schema similar to the real one + example query. Not enough? And why? I tried the example query and couldn't get Failed to make a valid plan out of it ... what settings do you need for that? It unfortunately depends on settings and luck. This dependence on luck is the reason why I'd like geqo to behave somewhat deterministically... With {join,from}_collapse_limit = 100 it seems to be triggered reliably. With lower values it seems harder to trigger, with bigger ones it simply takes too long to even get there. Efficiency-wise, when using geqo with higher limits nearly all the time is spent in:

geqo
gimme_tree
have_join_order_restriction
has_legal_joinclause
have_relevant_joinclause
have_relevant_eclass (30% self)
bms_overlap (50% self)

I am not yet fully understanding geqo, but it looks like there are some possibilities to improve this. Although such efficiency improvements would still not explain the completely failing plans... Do you have an idea which kinds of plans benefit most from using geqo? I had a somewhat hard time finding any query where geqo was substantially faster than the standard join search. That also somewhat explains why I saw improvements with 64bit bitmapsets... However, I do observe that this seems a sufficient counterexample against the theory that we can just remove the collapse limits and let GEQO save us on very complex queries. On my machine, the example query takes about 22 seconds to plan using CVS HEAD w/ all default settings. If I set both collapse_limit variables to very high values (I used 999), it takes ... um ... not sure; I gave up waiting after half an hour. I also tried with geqo_effort reduced to the minimum of 1, but that didn't produce a plan in reasonable time either (I gave up after ten minutes). So if we remove the collapse limits, Postgres will completely fail on this query --- the only way out would be enable_join_ordering = off, which is hardly likely to produce a decent plan. Maybe we should leave the collapse_limit logic alone and address Robert's gripes by just raising the default values a lot (I'm thinking 100 or so). That way there's an escape hatch for anyone who has pathological queries to deal with --- just dial the settings down. Yes, I think that's sensible. I don't know if there are any queries out there that benefit from higher limits. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thu, Jul 16, 2009 at 4:32 PM, Tom Lanet...@sss.pgh.pa.us wrote:

samples   %        image name    symbol name
886498    53.8090  postgres      have_relevant_eclass_joinclause
460596    27.9574  postgres      bms_overlap

So maybe a redesign of the equivalence-class joinclause mechanism is in order. Still, this is unlikely to fix the fundamental issue that the time for large join problems grows nonlinearly. Perhaps it's GEQO's fault that it's using these functions inappropriately, calling them often to calculate these answers whenever it needs them instead of looking once for join clauses and then optimizing based on the results. But I've never actually looked at geqo, maybe that's inherent in the design? -- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synch Rep for CommitFest 2009-07
Rick Gigger wrote: If you use an rsync like algorithm for doing the base backups wouldn't that increase the size of the database for which it would still be practical to just re-sync? Couldn't you in fact sync a very large database if the amount of actual change in the files was a small percentage of the total size? It would certainly help to reduce the network traffic, though you'd still have to scan all the data to see what has changed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
Greg Stark gsst...@mit.edu writes: On Thu, Jul 16, 2009 at 4:32 PM, Tom Lanet...@sss.pgh.pa.us wrote: So maybe a redesign of the equivalence-class joinclause mechanism is in order. Still, this is unlikely to fix the fundamental issue that the time for large join problems grows nonlinearly. Perhaps it's GEQO's fault that it's using these functions inappropriately, calling them often to calculate these answers whenever it needs them instead of looking once for join clauses and then optimizing based on the results. But I've never actually looked at geqo, mabe that's inherent in the design? geqo isn't doing anything the regular planner wouldn't do under similar conditions. It might well be that better caching is the answer to this particular problem, but I don't have time to look closer today. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thursday 16 July 2009 17:27:39 Greg Stark wrote: On Thu, Jul 16, 2009 at 4:16 PM, Tom Lanet...@sss.pgh.pa.us wrote: However, I do observe that this seems a sufficient counterexample against the theory that we can just remove the collapse limits and let GEQO save us on very complex queries. On my machine, the example query takes about 22 seconds to plan using CVS HEAD w/ all default settings. If I set both collapse_limit variables to very high values (I used 999), it takes ... um ... not sure; I gave up waiting after half an hour. What's the point of GEQO if it doesn't guarantee to produce the optimal plan and *also* doesn't guarantee to produce some plan, any plan, within some reasonable amount of time? Either we need to fix that or else I don't see what it's buying us over our regular planner, which also might not produce a plan within a reasonable amount of time, but at least if it does it'll be the right plan. Well, I could not find a plan where it errored out with the old limits. So one could argue it's just not adapted. Although I also could not find a single case where geqo was noticeably faster with the default settings even when it was used. The default settings currently make it relatively hard to trigger geqo at all. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
Andres Freund and...@anarazel.de writes: The default settings currently make it relatively hard to trigger geqo at all. Yes, and that was intentional. One of the implications of what we're discussing here is that geqo would get used a lot more for typical complex queries (if there is any such thing as a typical one). So it's fully to be expected that the fallout would be pressure to improve geqo in various ways. Given that we are at the start of the development cycle, that prospect doesn't scare me --- there's plenty of time to fix whatever needs fixing. However, I am leaning to the feeling that I don't want to be putting people in a position where they have no alternative but to use geqo. So adjusting rather than removing the collapse limits is seeming like a good idea. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] slow count in window query
2009/7/16 Hitoshi Harada umi.tan...@gmail.com: 2009/7/16 Greg Stark gsst...@mit.edu: On Wed, Jul 15, 2009 at 11:18 AM, Pavel Stehulepavel.steh...@gmail.com wrote:

postgres=# select avg(a) from (select a, row_number() over (order by a) as r, count(*) over () as rc from x) p where r in ((rc+1)/2, (rc+2)/2);

How does this compare to the plain non-windowing SQL implementation:

select a from x order by a offset (select trunc(count(*)/2) from x) limit 1

(except that that only works if count(*) is odd). Interestingly, finding the median is actually O(n) using Quickselect. Maybe we should provide a C implementation of quickselect as a window function. I'm not sure how to wedge in the concept that the sort is unnecessary even though the ORDER BY is specified, though. median() should be an aggregate, not a window function, shouldn't it? yes - the core of my topic is the significant slowness of a query that uses window functions compared to one using an aggregate function. This case could be simply optimized. This case isn't important for me. I simply played with window functions and found Celko's query - and I was surprised, because this query was faster than the other - I expected something else. I'm also not sure how to handle this if the set has to be spooled to disk. Quicksort and Quickselect do a lot of scans through the data and wouldn't perform well on disk. The WindowAgg spools rows into the tuplestore, which holds the data in memory as long as it fits. Do you have any idea how it stores millions and millions of rows without a tuplestore? Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
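As an aside for readers following the thread: Greg's plain-SQL variant can be extended to handle an even row count as well. A minimal sketch, assuming the same single-column table x from Pavel's example (a must be a type avg() accepts):

    -- returns the middle row for odd counts, and the mean of the
    -- two middle rows for even counts
    select avg(a) from (
        select a from x
        order by a
        offset (select (count(*) - 1) / 2 from x)
        limit 2 - (select count(*) % 2 from x)
    ) m;

It still pays for the full sort, of course, which is exactly the cost the quickselect idea above is aimed at avoiding.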
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
Andres Freund and...@anarazel.de writes: On Thursday 16 July 2009 17:16:31 Tom Lane wrote: I tried the example query and couldn't get Failed to make a valid plan out of it ... what settings do you need for that? It unfortunately depends on settings and luck. This dependence on luck is the reason why I'd like geqo to behave somewhat deterministically... With {join,from}_collapse_limit = 100 it seems to be triggered reliably. With lower values it seems harder to trigger, with bigger ones it simply takes too long to even get there. OK, I see it at 100. Would you confirm that what you get is the failure in random_init_pool (geqo_pool.c) and not the identically-phrased message elsewhere? (If you have VERBOSITY = verbose you should see the error location info.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
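For anyone wanting to reproduce this, the recipe under discussion amounts to a psql session along these lines (the query itself is the large example join attached in the other thread, elided here):

    SET join_collapse_limit = 100;   -- per Andres, reproduces reliably at 100
    SET from_collapse_limit = 100;
    SET geqo = on;
    \set VERBOSITY verbose
    EXPLAIN SELECT ... ;             -- the example query from the other thread

With VERBOSITY set to verbose, the error report carries the source-file location, which is how to tell the random_init_pool failure in geqo_pool.c apart from the identically worded message elsewhere.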
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thursday 16 July 2009 18:23:06 Tom Lane wrote: Andres Freund and...@anarazel.de writes: On Thursday 16 July 2009 17:16:31 Tom Lane wrote: I tried the example query and couldn't get Failed to make a valid plan out of it ... what settings do you need for that? It unfortunately depends on settings and luck. This dependence on luck is the reason why I'd like geqo to behave somewhat deterministically... With {join,from}_collapse_limit = 100 it seems to be triggered reliably. With lower values it seems harder to trigger, with bigger ones it simply takes too long to even get there. OK, I see it at 100. Would you confirm that what you get is the failure in random_init_pool (geqo_pool.c) and not the identically-phrased message elsewhere? (If you have VERBOSITY = verbose you should see the error location info.) Yes. I should have seen that. It's not exactly surprising... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thursday 16 July 2009 17:59:58 Tom Lane wrote: Andres Freund and...@anarazel.de writes: The default settings currently make it relatively hard to trigger geqo at all. Yes, and that was intentional. One of the implications of what we're discussing here is that geqo would get used a lot more for typical complex queries (if there is any such thing as a typical one). So it's fully to be expected that the fallout would be pressure to improve geqo in various ways. Given that we are at the start of the development cycle, that prospect doesn't scare me --- there's plenty of time to fix whatever needs fixing. However, I am leaning to the feeling that I don't want to be putting people in a position where they have no alternative but to use geqo. So adjusting rather than removing the collapse limits is seeming like a good idea. Hm. I see a somewhat more fundamental problem with geqo: I tried several queries, and found not a single one where the genetic process made any significant improvement to the 'worth'. It seems that the best variant out of the initial pool is always either the path chosen in the end, or at least the cost difference is _really_ low. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synch Rep for CommitFest 2009-07
On Thu, Jul 16, 2009 at 4:41 PM, Heikki Linnakangasheikki.linnakan...@enterprisedb.com wrote: Rick Gigger wrote: If you use an rsync like algorithm for doing the base backups wouldn't that increase the size of the database for which it would still be practical to just re-sync? Couldn't you in fact sync a very large database if the amount of actual change in the files was a small percentage of the total size? It would certainly help to reduce the network traffic, though you'd still have to scan all the data to see what has changed. The fundamental problem with pushing users to start over with a new base backup is that there's no relationship between the size of the WAL and the size of the database. You can plausibly have a system with an extremely high transaction rate generating WAL very quickly, but where the whole database fits in a few hundred megabytes. In that case you could be behind by only a few minutes and have it be faster to take a new base backup. Or you could have a petabyte database which is rarely updated. In which case it might be faster to apply weeks' worth of logs than to try to take a base backup. Only the sysadmin is actually going to know which makes more sense. Unless we start tying WAL parameters to the database size or something like that. -- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thu, Jul 16, 2009 at 12:49 PM, Andres Freundand...@anarazel.de wrote: On Thursday 16 July 2009 17:59:58 Tom Lane wrote: Andres Freund and...@anarazel.de writes: The default settings currently make it relatively hard to trigger geqo at all. Yes, and that was intentional. One of the implications of what we're discussing here is that geqo would get used a lot more for typical complex queries (if there is any such thing as a typical one). So it's fully to be expected that the fallout would be pressure to improve geqo in various ways. Given that we are at the start of the development cycle, that prospect doesn't scare me --- there's plenty of time to fix whatever needs fixing. However, I am leaning to the feeling that I don't want to be putting people in a position where they have no alternative but to use geqo. So adjusting rather than removing the collapse limits is seeming like a good idea. Hm. I see a somewhat more fundamental problem with geqo: I tried several queries, and found not a single one where the genetic process made any significant improvement to the 'worth'. It seems that the best variant out of the initial pool is always either the path chosen in the end, or at least the cost difference is _really_ low. Ouch. Did you insert some debugging code to get that information, or how did you come to that conclusion? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Thu, 2009-07-16 at 15:22 +1000, Brendan Jurd wrote: I had a play around with the feature in psql. I think the syntax is okay, but using ALTER TABLE ... ADD as you mentioned upthread could be a better option. Ok, I think we're pretty much settled on that option then. Another idea that I thought about is that:

    ALTER TABLE foo ADD UNIQUE (a, b) USING foo_idx;

could be a shorthand for:

    ALTER TABLE foo ADD INDEX CONSTRAINT (a =, b =) USING foo_idx;

The benefit is that it could go over GiST indexes or hash indexes, not just btrees. The syntax could also be useful to turn an existing btree into a unique btree. I noticed that there's no change to the output of \d in psql to show the constraint, so when I do a \d on my test table, I can see that there's a gist index there, but I can't tell that there is also a constraint on it. This seems like a pretty significant shortcoming. Essentially once you've created one of these index constraints, it vanishes into the catalogs and becomes invisible to the user. This might call for a modification of pg_get_indexdef()? I agree, that's important. Psql support, regression tests, and docs are all intertwined somewhat with the syntax, so I held off on that work until I got a little feedback. I will get to work and see if I can put together a more complete version in the next few days. If you happen to have time, you can see if you can break my current patch. I expect the basic algorithm to remain about the same for my next version, so if you see any problems with that, please let me know. Also, if you see any possible improvements that could make it useful for more situations, that would be helpful, too. But I think I have enough information to move forward, so if you want to move on to a more complete patch, feel free. Thanks for the review! Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
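To make the proposed semantics concrete, a hypothetical usage sketch (ADD INDEX CONSTRAINT is Jeff's proposed syntax, not anything committed; the operator after each column names what counts as a conflict; the example assumes a period-like type with an overlaps operator && and btree_gist for the integer column):

    -- bookings for the same room must not overlap in time
    CREATE TABLE booking (room int, during period);
    CREATE INDEX booking_idx ON booking USING gist (room, during);

    ALTER TABLE booking ADD INDEX CONSTRAINT (room =, during &&)
        USING booking_idx;

With that in place, an INSERT whose (room, during) pair finds an existing row with equal room and overlapping during would be rejected, just as a unique btree rejects an equal key.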
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thu, Jul 16, 2009 at 11:32 AM, Tom Lanet...@sss.pgh.pa.us wrote: I wrote: If I set both collapse_limit variables to very high values (I used 999), it takes ... um ... not sure; I gave up waiting after half an hour. I also tried with geqo_effort reduced to the minimum of 1, but that didn't produce a plan in reasonable time either (I gave up after ten minutes). After I gave up letting the machine be idle to get a fair timing, I turned on oprofile monitoring. It looks a bit interesting: That is interesting, but there's not really enough detail here to see what is going on. I'm more interested in what the high-level functions are doing that's causing these guys to be called so many times. As Greg says, if the planning time curve for GEQO isn't better than the one for the standard planner, it's the epitome of pointless. So maybe a redesign of the equivalence-class joinclause mechanism is in order. Still, this is unlikely to fix the fundamental issue that the time for large join problems grows nonlinearly. Nonlinear is one thing, but this looks more like exponential. I understand that the standard planner is exponential; GEQO should not be. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thursday 16 July 2009 19:13:55 Robert Haas wrote: On Thu, Jul 16, 2009 at 12:49 PM, Andres Freundand...@anarazel.de wrote: On Thursday 16 July 2009 17:59:58 Tom Lane wrote: Andres Freund and...@anarazel.de writes: The default settings currently make it relatively hard to trigger geqo at all. Yes, and that was intentional. One of the implications of what we're discussing here is that geqo would get used a lot more for typical complex queries (if there is any such thing as a typical one). So it's fully to be expected that the fallout would be pressure to improve geqo in various ways. Given that we are at the start of the development cycle, that prospect doesn't scare me --- there's plenty of time to fix whatever needs fixing. However, I am leaning to the feeling that I don't want to be putting people in a position where they have no alternative but to use geqo. So adjusting rather than removing the collapse limits is seeming like a good idea. Hm. I see a somewhat more fundamental problem with geqo: I tried several queries, and found not a single one where the genetic process made any significant improvement to the 'worth'. It seems that the best variant out of the initial pool is always either the path chosen in the end, or at least the cost difference is _really_ low. Ouch. Did you insert some debugging code to get that information, or how did you come to that conclusion? Yes, I enabled GEQO_DEBUG and added some more debugging output. Btw, a higher generation count does not change that. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
Robert Haas robertmh...@gmail.com writes: On Thu, Jul 16, 2009 at 11:32 AM, Tom Lanet...@sss.pgh.pa.us wrote: So maybe a redesign of the equivalence-class joinclause mechanism is in order. Still, this is unlikely to fix the fundamental issue that the time for large join problems grows nonlinearly. Nonlinear is one thing, but this looks more like exponential. I understand that the standard planner is exponential; GEQO should not be. Well, the equivclass code is new as of 8.3. It's possible that this got broken relatively recently ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thursday 16 July 2009 19:22:30 Robert Haas wrote: On Thu, Jul 16, 2009 at 11:32 AM, Tom Lanet...@sss.pgh.pa.us wrote: I wrote: If I set both collapse_limit variables to very high values (I used 999), it takes ... um ... not sure; I gave up waiting after half an hour. I also tried with geqo_effort reduced to the minimum of 1, but that didn't produce a plan in reasonable time either (I gave up after ten minutes). After I gave up letting the machine be idle to get a fair timing, I turned on oprofile monitoring. It looks a bit interesting: That is interesting, but there's not really enough detail here to see what is going on. I'm more interested in what the high-level functions are doing that's causing these guys to be called so many times. As Greg says, if the planning time curve for GEQO isn't better than the one for the standard planner, it's the epitome of pointless. It is not the actual genetic search, I now found out (or, more precisely, I finally read the trace correctly). At the start of the query GEQO fills a pool with random paths through the search space. Unfortunately, a random path is not very likely to succeed. So it checks and checks and... That's why the problem is not visible with a simple join of 100 or so tables - all paths are valid there... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects
2009/7/16 KaiGai Kohei kai...@ak.jp.nec.com: However, I don't think the initial proposal of the largeobject security is now on the state to be reviewed seriously. OK, I am moving this patch to returned with feedback. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] plpythonu datatype conversion improvements
Sorry about that. Here it is again as an attachment. -Caleb On 7/16/09 7:16 AM, Peter Eisentraut pete...@gmx.net wrote: On Wednesday 27 May 2009 02:07:33 Caleb Welton wrote: Patch for plpythonu This patch doesn't apply; I think it got mangled during email transport. (Tabs changed to spaces, it looks like.) Could you resend the patch as a separate attachment in a way that it doesn't get mangled? plpython_bytea.patch Description: plpython_bytea.patch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Docbook toolchain interfering with patch review?
All, Well, after an hour of tinkering with docbook DTDs and openjade I've given up on building docs for the patch I was reviewing on my Mac. If I'm encountering this difficulty building docs, so are many of the other new patch reviewers. Which means we're *not* reviewing docs for completeness, correctness, or correspondence to the actual feature syntax until beta time. This seems like a serious issue for development. Reviewers, how many of you are able to build docs with each patch? -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY WITH CSV FORCE QUOTE * -- REVIEW
All,

1) Patch applies cleanly against CVS head.
2) Patch compiles and builds cleanly.
3) Unable to check docs because of general doc build problems.
4) Tested the following commands, using a 10MB table of PostgreSQL log data:

postgres=# COPY marchlog TO '/tmp/marchlog1.csv' with csv header;
COPY 81097
postgres=# COPY marchlog TO '/tmp/marchlog2.csv' with csv header force quote *;
COPY 81097
postgres=# COPY marchlog TO '/tmp/marchlog3.csv' with csv header force quote process_id;
COPY 81097
postgres=# COPY marchlog TO '/tmp/marchlog4.csv' with csv force quote *;
COPY 81097
postgres=# COPY marchlog TO '/tmp/marchlog5.csv' with force quote *;
ERROR:  COPY force quote available only in CSV mode
STATEMENT:  COPY marchlog TO '/tmp/marchlog5.csv' with force quote *;
ERROR:  COPY force quote available only in CSV mode
postgres=# COPY reloadlog FROM '/tmp/marchlog2.csv' with csv header;
COPY 81097
postgres-# \copy marchlog TO '/tmp/marchlog5.csv' with csv force quote *;
postgres-#

Per discussion, I did not test FORCE QUOTE NOT NULL *. All output looked as expected. This patch did not seem to change earlier functionality, and seems to quote as specified. Unless there are other things we want to test (CLOBs?) I think the patch is probably ready for code review of the FORCE QUOTE * portion. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] pg_migrator not setting values of sequences?
Peter Eisentraut wrote: On Thursday 16 July 2009 07:09:22 Bruce Momjian wrote: Uh, how is this going to behave in 8.5? Do we still dump sequences, and if so, aren't we heading down the road of dumping stuff only because a previous release needed it? Which leads me to a related question: Do you plan to maintain one version of pg_migrator that can upgrade any version to any other version (within reason), or will there be separate binaries, say pg_migrator-8.4 and pg_migrator-8.5, that each can only upgrade from $selfversion-1 to $selfversion? One binary/source tree. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY WITH CSV FORCE QUOTE *
Andrew, FORCE NOT NULL is in any case a fairly blunt instrument - it doesn't work for a column of any type that doesn't accept an empty string as valid input, such as numeric types. Con: this allows COPY to produce output which cannot be reloaded into PostgreSQL. Pro: there is a lot of extremely broken external software which expects nulls to be expressed as "". This improves compatibility with them. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_stat_activity.application_name
On the admin list there was a request for an application name column in pg_stat_activity. http://archives.postgresql.org/pgsql-admin/2009-07/msg00095.php This is available in a lot of other DBMS products, can be useful to DBAs, and seems pretty cheap and easy. Could we get that onto the TODO list? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_activity.application_name
On Thu, Jul 16, 2009 at 2:08 PM, Kevin Grittnerkevin.gritt...@wicourts.gov wrote: On the admin list there was a request for an application name column in pg_stat_activity. http://archives.postgresql.org/pgsql-admin/2009-07/msg00095.php This is available in a lot of other DBMS products, can be useful to DBAs, and seems pretty cheap and easy. ah? how do you implement that? and what's the use case for it? -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] navigation menu for documents
On Jul 14, 2009, at 3:21 PM, Andrew Dunstan wrote: Yes, really. What you suggest here is just not adequate, IMNSHO. I don't want to have to scroll to the top or bottom of the page to get navigation, and I want to be able to see the navigation and go where I want directly. Hey Andrew, Check out what I've done for the Bricolage documentation: http://www.bricolagecms.org/docs/devel/api/ I'm using jQuery to pull the proper doc into a div. I'm still noodling with it, trying to fix encoding issues on Windows, but it's pretty close to done. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synch Rep for CommitFest 2009-07
On Jul 16, 2009, at 11:09 AM, Greg Stark wrote: On Thu, Jul 16, 2009 at 4:41 PM, Heikki Linnakangasheikki.linnakan...@enterprisedb.com wrote: Rick Gigger wrote: If you use an rsync like algorithm for doing the base backups wouldn't that increase the size of the database for which it would still be practical to just re-sync? Couldn't you in fact sync a very large database if the amount of actual change in the files was a small percentage of the total size? It would certainly help to reduce the network traffic, though you'd still have to scan all the data to see what has changed. The fundamental problem with pushing users to start over with a new base backup is that there's no relationship between the size of the WAL and the size of the database. You can plausibly have a system with an extremely high transaction rate generating WAL very quickly, but where the whole database fits in a few hundred megabytes. In that case you could be behind by only a few minutes and have it be faster to take a new base backup. Or you could have a petabyte database which is rarely updated. In which case it might be faster to apply weeks' worth of logs than to try to take a base backup. Only the sysadmin is actually going to know which makes more sense. Unless we start tying WAL parameters to the database size or something like that. Once again, wouldn't an rsync-like algorithm help here? Couldn't you have the default be to just create a new base backup for them, but then allow you to specify an existing base backup if you've already got one? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY WITH CSV FORCE QUOTE *
Josh Berkus wrote: Andrew, FORCE NOT NULL is in any case a fairly blunt instrument - it doesn't work for a column of any type that doesn't accept an empty string as valid input, such as numeric types. Con: this allows COPY to produce output which cannot be reloaded into PostgreSQL. Pro: there is a lot of extremely broken external software which expects nulls to be expressed as "". This improves compatibility with them. FORCE NOT NULL is only valid when we import data, not when we export data, so what other programs expect to receive is irrelevant to any argument about FORCE NOT NULL. AFAICT on a brief look at the patch, it doesn't affect the quoting of nulls on export, it just allows * as an alias for all columns for FORCE QUOTE (as well as FORCE NOT NULL). But FORCE QUOTE has never forced quoting of null values, only non-null values. We have never quoted null values, and I'm fairly resistant to any suggestion that we should. As for importing data from programs that produce all values in quotes including null/missing values (your pro case above), arguably what we need is another flag that would turn an empty string into a null. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
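Until such a flag exists, the usual workaround is to stage the file through text columns and fold empty strings afterwards; a minimal sketch (table and column names invented for illustration):

    -- load everything as text first
    CREATE TEMP TABLE staging (id text, price text);
    COPY staging FROM '/tmp/export.csv' WITH CSV HEADER;

    -- then convert empty strings to NULL while casting
    INSERT INTO target (id, price)
    SELECT NULLIF(id, '')::int, NULLIF(price, '')::numeric
      FROM staging;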
Re: [HACKERS] pg_stat_activity.application_name
Jaime Casanova jcasa...@systemguards.com.ec wrote: Kevin Grittnerkevin.gritt...@wicourts.gov wrote: On the admin list there was a request for an application name column in pg_stat_activity. ah? how do you implement that? and what's the use case for it? It would be passed as a connection property. (If that's not feasible, perhaps a session GUC, which could map to a client-side connection property in JDBC, etc.) For many environments, it is at least as important as an IP address or PID to help someone managing a database with a lot of connections. It would not do anything on the server except show up in pg_stat_activity as another piece of information about each connection. We would probably want to modify psql, pg_dump, etc. to put the application name into this connection property, at least by default. We may want to add a command-line switch to allow user override -- to provide something more detailed. For example, --application-name='Weekly Purge' could be specified on the psql command line. Yes, this is only as useful as the compliance of the client applications makes it. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
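A sketch of what the proposal might look like end to end (hypothetical: neither the application_name setting nor the pg_stat_activity column exists yet; procpid and current_query are the columns that do exist today):

    -- client identifies itself, e.g. right after connecting
    SET application_name = 'Weekly Purge';

    -- DBA sees it next to the rest of the per-connection state
    SELECT procpid, application_name, current_query
      FROM pg_stat_activity;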
Re: [HACKERS] pg_stat_activity.application_name
On Thu, Jul 16, 2009 at 8:08 PM, Kevin Grittnerkevin.gritt...@wicourts.gov wrote: On the admin list there was a request for an application name column in pg_stat_activity. http://archives.postgresql.org/pgsql-admin/2009-07/msg00095.php This is available in a lot of other DBMS products, can be useful to DBAs, and seems pretty cheap and easy. Could we get that onto the TODO list? I think you should just add it. Ok, we probably need some kind of policy for what to do before just adding things to the TODO, but I think it should be relatively liberal. Something like: you should post that you're going to add it to the -hackers list, and get at least one person agreeing with the item and no fatal flaws. Oh, and you should check for duplicates or for the same item on the things we don't want list. But having done that, you should assume it's up to you to just go ahead and add it. In this case I don't see any harm in having an opaque application identifier. Dangers (but surmountable ones, I assume) would be:

1) The authenticity of the application identifier needs to be downplayed -- don't even think of using it for security, for example.
2) encoding issues if different connections are in different encodings...
3) backwards compatibility, both in the library API and the protocol

-- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synch Rep for CommitFest 2009-07
On Thu, Jul 16, 2009 at 1:09 PM, Greg Starkgsst...@mit.edu wrote: On Thu, Jul 16, 2009 at 4:41 PM, Heikki Linnakangasheikki.linnakan...@enterprisedb.com wrote: Rick Gigger wrote: If you use an rsync like algorithm for doing the base backups wouldn't that increase the size of the database for which it would still be practical to just re-sync? Couldn't you in fact sync a very large database if the amount of actual change in the files was a small percentage of the total size? It would certainly help to reduce the network traffic, though you'd still have to scan all the data to see what has changed. The fundamental problem with pushing users to start over with a new base backup is that there's no relationship between the size of the WAL and the size of the database. You can plausibly have a system with an extremely high transaction rate generating WAL very quickly, but where the whole database fits in a few hundred megabytes. In that case you could be behind by only a few minutes and have it be faster to take a new base backup. Or you could have a petabyte database which is rarely updated. In which case it might be faster to apply weeks' worth of logs than to try to take a base backup. Only the sysadmin is actually going to know which makes more sense. Unless we start tying WAL parameters to the database size or something like that. I think we need a way for the master to know who its slaves are and keep any given bit of WAL available until all slaves have successfully read it, just as we keep each WAL file until we successfully copy it to the archive. Otherwise, there's no way to be sure that a connection break won't result in the need for a new base backup. (In a way, a slave is very similar to an additional archive.) ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] navigation menu for documents
David E. Wheeler wrote: On Jul 14, 2009, at 3:21 PM, Andrew Dunstan wrote: Yes, really. What you suggest here is just not adequate, IMNSHO. I don't want to have to scroll to the top or bottom of the page to get navigation, and I want to be able to see the navigation and go where I want directly. Hey Andrew, Check out what I've done for the Bricolage documentation: http://www.bricolagecms.org/docs/devel/api/ I'm using jQuery to pull the proper doc into a div. I'm still noodling with it, trying to fix encoding issues on Windows, but it's pretty close to done. Yes, that's nice, it's just the sort of thing I had in mind - if you can do it with a div instead of frames I'm fine with that. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY WITH CSV FORCE QUOTE * -- REVIEW
On Thu, Jul 16, 2009 at 2:47 PM, Josh Berkusj...@agliodbs.com wrote: Unless there are other things we want to test (CLOBs?) I think the patch is probably ready for code review of the FORCE QUOTE * portion. I think perhaps we should ask the patch author to remove the NOT NULL stuff first? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_activity.application_name
Greg Stark gsst...@mit.edu wrote: Kevin Grittnerkevin.gritt...@wicourts.gov wrote: On the admin list there was a request for an application name column in pg_stat_activity. http://archives.postgresql.org/pgsql-admin/2009-07/msg00095.php This is available in a lot of other DBMS products, can be useful to DBAs, and seems pretty cheap and easy. Could we get that onto the TODO list? I think you should just add it. Done. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY WITH CSV FORCE QUOTE * -- REVIEW
On 7/16/09 12:53 PM, Robert Haas wrote: On Thu, Jul 16, 2009 at 2:47 PM, Josh Berkusj...@agliodbs.com wrote: Unless there are other things we want to test (CLOBs?) I think the patch is probably ready for code review of the FORCE QUOTE * portion. I think perhaps we should ask the patch author to remove the NOT NULL stuff first? Yes, current status is Waiting on Author. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_activity.application_name
Kevin Grittner wrote: We would probably want to modify psql, pg_dump, etc. to put the application name into this connection property, at least by default. We may want to add a command-line switch to allow user override -- to provide something more detailed. For example, --application-name='Weekly Purge' could by specified on the psql command line. I've seen it set to argv[0] on other DBMSs, if not set explicitly. That would be a reasonable default, and would handle psql, pg_dump etc. without changes. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thu, Jul 16, 2009 at 06:49:08PM +0200, Andres Freund wrote: On Thursday 16 July 2009 17:59:58 Tom Lane wrote: Andres Freund and...@anarazel.de writes: The default settings currently make it relatively hard to trigger geqo at all. Yes, and that was intentional. One of the implications of what we're discussing here is that geqo would get used a lot more for typical complex queries (if there is any such thing as a typical one). So it's fully to be expected that the fallout would be pressure to improve geqo in various ways. Given that we are at the start of the development cycle, that prospect doesn't scare me --- there's plenty of time to fix whatever needs fixing. However, I am leaning to the feeling that I don't want to be putting people in a position where they have no alternative but to use geqo. So adjusting rather than removing the collapse limits is seeming like a good idea. Hm. I see a, a bit more fundamental problem with geqo: I tried several queries, and I found not a single one, where the whole genetical process did any significant improvments to the 'worth'. It seems that always the best variant out of the pool is either the path choosen in the end, or at least the cost difference is _really_ low. Andres Hi Andres, From some of my reading of the literature on join order optimization via random sampling, such as what would establish the initial GEQO pool, there is a very good possibility of having a pretty good plan in the first pool, especially for our larger initial pool sizes of 100-1000. And in fact, the final plan has a good chance of being of approximately the same cost as a member of the initial pool. Uniform sampling alone can give you a close to optimum plan 80% of the time with an initial sample size of 100. And using biased sampling raises that to 99% or better. Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
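A quick back-of-the-envelope check on that claim (assuming, simplistically, that pool members are independent uniform samples): if each sampled join order lands within some cost bound of the optimum with probability p, a pool of N samples contains at least one such plan with probability 1 - (1 - p)^N. Ken's 80%-at-N=100 figure corresponds to p ≈ 0.016; with the same p, at N = 1000 the miss probability is (1 - p)^1000 ≈ e^-16, i.e. negligible. That is consistent with Andres's observation upthread that the genetic search rarely improves on the best member of the initial pool.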
Re: [HACKERS] COPY WITH CSV FORCE QUOTE *
Andrew, AFAICT on a brief look at the patch, it doesn't affect the quoting of nulls on export, it just allows * as an alias for all columns for FORCE QUOTE (as well as FORCE NOT NULL). But FORCE QUOTE has never forced quoting of null values, only non-null values. We have never quoted null values, and I'm fairly resistant to any suggestion that we should. See? That's what happens when I can't build the docs. ;-) (and there's no previous discussion of the feature). As for importing data from programs that produce all values in quotes including null/missing values (your pro case above), arguably what we need is another flag that would turn an empty string into a null. Hmm, TODO, please? There's a lot of this out there, and I've had to build sed into a lot of import routines. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Docbook toolchain interfering with patch review?
On Thu, Jul 16, 2009 at 2:34 PM, Josh Berkusj...@agliodbs.com wrote: All, Well, after an hour of tinkering with docbook DTDs and openjade I've given up on building docs for the patch I was reviewing on my Mac. If I'm encountering this difficulty building docs, so are many of the other new patch reviewers. Which means we're *not* reviewing docs for completeness, correctness, or correspondence to the actual feature syntax until beta time. This seems like a serious issue for development. Reviewers, how many of you are able to build docs with each patch? Isn't it possible though to write and/or review the documentation patch without building it? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY WITH CSV FORCE QUOTE *
Josh Berkus wrote: Andrew, AFAICT on a brief look at the patch, it doesn't affect the quoting of nulls on export, it just allows * as an alias for all columns for FORCE QUOTE (as well as FORCE NOT NULL). But FORCE QUOTE has never forced quoting of null values, only non-null values. We have never quoted null values, and I'm fairly resistant to any suggestion that we should. See? That's what happens when I can't build the docs. ;-) (and there's no previous discussion of the feature). As for importing data from programs that produce all values in quotes including null/missing values (your pro case above), arguably what we need is another flag that would turn an empty string into a null. Hmm, TODO, please? There's a lot of this out there, and I've had to build sed into a lot of import routines. +1 for that on the TODO, happens all the time... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for TODO Item: Add prompt escape to display the client and server versions
On Thursday 07 May 2009 05:23:41 Dickson S. Guedes wrote: This is a WIP patch (for the TODO item in the subject) that I'm putting in the Commit Fest queue for 8.5. The problem I'm seeing with this is that currently it resolves

%v (client) = 8.5devel
%V (server) = 8.5.0

Besides being inconsistent, it's also pretty misleading. I'd imagine a significant use of this feature would be to know that one is connected to, say, a beta version. More generally, does anyone actually need this feature? psql complains loudly enough if the version numbers are not the right ones. I don't know why this would need to be repeated in the prompt. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Docbook toolchain interfering with patch review?
On Thu, 16 Jul 2009, Josh Berkus wrote: Well, after an hour of tinkering with docbook DTDs and openjade I've given up on building docs for the patch I was reviewing on my Mac. It's easier to get the whole chain working under Linux, but even that isn't trivial. I think one useful step here would be to write up some practical docs on the package setup side for various popular platforms on the wiki. I can probably find where I have the RedHat and Ubuntu recipes I use around here somewhere, to kick that off as part of the review I'm doing for the multi-threaded pgbench. It's been my experience that everybody runs into pretty much the same problems here getting started, but said problems are unique to the OS. If someone writes up something similar for OS X, so there's a recipe for getting the standard docs built on all the major development platforms where this could be straightforward (I shudder to think what a Cygwin guide would look like), that would make it much easier to push toward having more people do doc review. -- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY WITH CSV FORCE QUOTE *
Chris Spotts wrote: As for importing data from programs that produce all values in quotes including null/missing values (your pro case above), arguably what we need is another flag that would turn an empty string into a null. Hmm, TODO, please? There's a lot of this out there, and I've had to build sed into a lot of import routines. +1 for that on the TODO, happens all the time... Well, somebody had better suggest a syntax for it, preferably without adding yet another keyword. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
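One hypothetical spelling, offered purely as a straw man that reuses existing keywords rather than adding one:

    -- treat quoted empty strings in the listed columns as NULL on import
    COPY tbl FROM '/tmp/export.csv' WITH CSV FORCE NULL col1, col2;

It mirrors the existing FORCE QUOTE / FORCE NOT NULL column lists, so the grammar changes would follow an established pattern.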
Re: [HACKERS] Make planning via GEQO repeatable
Andres Freund and...@anarazel.de writes: Query planning via GEQO currently can yield a different plan on every invokation of the planner due to its non-exhaustive nature. This often can be inconvenient because at times there may be a very bad plan. It also makes it very hard to reproduce a problem with GEQO. Applied with some editorialization. Mainly, I didn't see the point of preserving the ability to have nondeterministic planning, and I especially didn't care for having that still be the default behavior. So I just made it unconditionally initialize the seed. It would of course take only minor tweaking to do things differently. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: support for multiplexing SIGUSR1
On Thu, Jul 16, 2009 at 2:57 AM, Jaime Casanovajcasa...@systemguards.com.ec wrote: Hi, I'm reviewing this patch: http://archives.postgresql.org/message-id/3f0b79eb0907022341m1d36a841x19c3e2a5a6906...@mail.gmail.com Another thing that caught my attention: I don't think this is safe (it assumes only one auxiliary process of each type; I don't know whether we ever run several of the same kind, but...):

+       /*
+        * Assign backend ID to auxiliary processes like backends, in order to
+        * allow multiplexing signal to auxiliary processes. Since backends use
+        * ID in the range from 1 to MaxBackends (inclusive), we assign
+        * auxiliary processes with MaxBackends + AuxProcType + 1 as an unique ID.
+        */
+       MyBackendId = MaxBackends + auxType + 1;
+       MyProc->backendId = MyBackendId;

-- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make planning via GEQO repeatable
On Thursday 16 July 2009 23:04:58 Tom Lane wrote: Andres Freund and...@anarazel.de writes: Query planning via GEQO currently can yield a different plan on every invocation of the planner due to its non-exhaustive nature. This often can be inconvenient because at times there may be a very bad plan. It also makes it very hard to reproduce a problem with GEQO. Applied with some editorialization. Mainly, I didn't see the point of preserving the ability to have nondeterministic planning, and I especially didn't care for having that still be the default behavior. So I just made it unconditionally initialize the seed. It would of course take only minor tweaking to do things differently. Nice. Mainly I did not have the guts to change the behaviour completely... archive.org has a copy of the dead link to the comp.ai.genetic FAQ linked at http://web.archive.org/web/20051226001402/http://www.cs.bham.ac.uk/Mirrors/ftp.de.uu.net/EC/clife/www/location.htm which is the same as the one referenced by Alvaro in http://archives.postgresql.org/pgsql-docs/2009-07/msg4.php If considered relevant enough, you can update the link... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] boolean in C
--On 16 July 2009 13:32:03 +0100 Grzegorz Jaskiewicz g...@pointblue.com.pl wrote: oh, another thing. stdbool is a C99 standard feature, not a gcc extension. There might be compiler versions out there which claim to be C99 but do not provide fully compliant include headers. SUN Studio 12 at least has the following in its documentation, as a quick search brings up: Though the compiler defaults to supporting the features of C99 listed below, standard headers provided by the Solaris software in /usr/include do not yet conform with the 1999 ISO/IEC C standard -- Thanks Bernd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Docbook toolchain interfering with patch review?
Merlin Moncure wrote: Isn't it possible though to write and/or review the documentation patch without building it?

cd pgsql/doc/src/sgml
gmake check

-- Euler Taveira de Oliveira http://www.timbira.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] boolean in C
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Bernd Helmle Sent: Thursday, July 16, 2009 8:47 AM To: Grzegorz Jaskiewicz Cc: pgsql-hackers Hackers Subject: Re: [HACKERS] boolean in C --On 16 July 2009 13:32:03 +0100 Grzegorz Jaskiewicz g...@pointblue.com.pl wrote: oh, another thing. stdbool is a C99 standard feature, not a gcc extension. There might be compiler versions out there which claim to be C99 but do not provide fully compliant include headers. SUN Studio 12 at least has the following in its documentation, as a quick search brings up: Though the compiler defaults to supporting the features of C99 listed below, standard headers provided by the Solaris software in /usr/include do not yet conform with the 1999 ISO/IEC C standard It's more or less a generic problem. There is only a handful of fully functional C99 compilers[0], and all the others have some C99 features to one degree or another. Microsoft's compiler is particularly abysmal, but then again, they have no claims of C99 compliance so there is nothing to complain about there. Those few features that they do implement are implemented in a non-standard way. GCC is also only partially compliant[1]. I believe that the Dinkum library is the only certified C99 standard library[2] as well. [0] see: http://www.peren.com/pages/branding_set.htm [1] see: http://gcc.gnu.org/c99status.html [2] see: http://www.dinkumware.com/manuals/ I don't think that a product (that is expected to run on as many platforms as PostgreSQL is expected to run on) is even possible to write in C99 code because there are not enough compliant compilers available. IMO-YMMV -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Status report: getting plpgsql to use the core lexer
Tom Lane t...@sss.pgh.pa.us wrote: One problem that wasn't obvious when I started is that if you are trying to use a reentrant lexer, Bison insists on including its YYSTYPE union in the call signature of the lexer. Of course, YYSTYPE means different things to the core grammar and plpgsql's grammar. I tried to work around that by having an interface layer that would (among other duties) translate as needed. It turned out to be a real PITA, not least because you can't include both definitions into the same C file. The scheme I have has more or less failed --- I think I'd need *two* interface layers to make it work without unmaintainable kluges. It would probably be better to try to adjust the core lexer's API some more so that it does not depend on the core YYSTYPE, but I'm not sure yet how to get Bison to play along without injecting an interface layer (and hence wasted cycles) into the core grammar/lexer interface. Another pretty serious issue is that the current plpgsql lexer treats various sorts of qualified names as single tokens. I had thought this could be worked around in the interface layer by doing more lookahead. You can do that, and it mostly works, but it's mighty tedious. The big problem is that yytext gets out of step --- it will point at the last token the core lexer has processed, and there's no good way to back it up after lookahead. I spent a fair amount of time trying to work around that by eliminating uses of yytext in plpgsql, and mostly succeeded, but there are still some left. (Some of the remaining regression failures are error messages that point at the wrong token because they rely on yytext.) Now, having name lookup happen at the lexical level is pretty bogus anyhow. The long-term solution here is probably to avoid doing lookup in the plpgsql lexer and move it into some sort of callback hook in the main parser, as we've discussed before. I didn't want to get into that right away, but I'm now thinking it has to happen before, not after, refactoring the lexer code. One issue that has to be surmounted before that can happen is that plpgsql currently throws away all knowledge of syntactic scope after initial processing of a function --- the name stack is no longer available when we want to parse individual SQL commands. We can probably rearrange that design, but it's another bit of work I don't have time for right now. All of this sounds pretty familiar to me. As you may recall, our framework includes a SQL parser which parses the subset of standard SQL we feel is portable enough, and generates Java classes to implement the code in lowest-common-denominator SQL, with all procedural code for triggers and stored procedures handled in Java (which runs in our middle-tier database service). We use ANTLR, and initially had a three-phase process: lexer, parser, and tree-walkers to generate code. We were doing way too much in the parser phase -- checking for table names, column names, data types, etc. The syntax of SQL forced us to do a lot of scanning forward and remembering where we were (especially to get the FROM clause information so we could process columns in the result list). We got much cleaner code by rewriting the parser into a dumb phase that captures the overall structure in an AST, followed by a tree-walker phase that does all the lookups and type resolution once the rough structure is known, writing another AST to walk for code generation.
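Roughly, the shape of that split looks like this (a toy sketch in C rather than our Java/ANTLR code, with purely hypothetical structures): pass one stores identifiers and their source offsets without resolving anything, and a later walk over the finished tree does the lookups.

    #include <stddef.h>
    #include <stdio.h>
    #include <string.h>

    /* Pass 1 output: a raw AST node.  Identifiers are recorded verbatim
     * together with their source offset; nothing is looked up yet. */
    typedef struct Node
    {
        const char  *ident;
        int          location;  /* kept so errors can point at the source */
        int          attno;     /* filled in by pass 2; -1 = unresolved   */
        struct Node *left, *right;
    } Node;

    /* Pass 2: walk the already-built tree and resolve names.  By now the
     * whole statement (e.g. the FROM clause, here a toy column list) is
     * known, so lookups need not happen mid-scan. */
    static void resolve(Node *n)
    {
        static const char *cols[] = {"id", "name"};

        if (n == NULL)
            return;
        n->attno = -1;
        for (int i = 0; i < 2; i++)
            if (strcmp(n->ident, cols[i]) == 0)
                n->attno = i + 1;
        if (n->attno < 0)
            fprintf(stderr, "unknown column \"%s\" at offset %d\n",
                    n->ident, n->location);
        resolve(n->left);
        resolve(n->right);
    }

    int main(void)
    {
        Node leaf = {"nmae", 23, 0, NULL, NULL};    /* typo on purpose */
        Node root = {"id", 7, 0, &leaf, NULL};
        resolve(&root);
        return 0;
    }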
Besides making the code cleaner and easier to maintain, it helped us give better error messages pointing more accurately to the source of the problem. I don't know if a similar approach is feasible in flex/bison, but if it is, refactoring for an extra pass might be worth the trouble. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Status report: getting plpgsql to use the core lexer
Kevin Grittner kevin.gritt...@wicourts.gov writes: ... We were able to get to much cleaner code by rewriting the parser to have a dumb phase to get the overall structure into an AST, and then use a tree-walker phase to do all the lookups and type resolution after we had the rough structure, writing another AST to walk for code generation. Besides making the code cleaner and easier to maintain, it helped us give better error messages pointing more accurately to the source of the problem. I don't know if a similar approach is feasible in flex/bison, but if it is, refactoring for an extra pass might be worth the trouble. That's actually what we have in the core parser. plpgsql is trying to take shortcuts, and this whole project is exactly about weaning it away from that. The bottom line is I tried to tackle the sub-projects in the wrong order... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Docbook toolchain interfering with patch review?
Greg Smith gsm...@gregsmith.com writes: On Thu, 16 Jul 2009, Josh Berkus wrote: Well, after an hour of tinkering with docbook DTDs and openjade I've given up on building docs for the patch I was reviewing on my Mac. It's easier to get the whole chain working under Linux, but even that isn't trivial. Really? It's just worked for me on the last several Fedora releases. You do need to install the docbook packages of course ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Docbook toolchain interfering with patch review?
Tom Lane wrote: Greg Smith gsm...@gregsmith.com writes: On Thu, 16 Jul 2009, Josh Berkus wrote: Well, after an hour of tinkering with docbook DTDs and openjade I've given up on building docs for the patch I was reviewing on my Mac. It's easier to get the whole chain working under Linux, but even that isn't trivial. Really? It's just worked for me on the last several Fedora releases. You do need to install the docbook packages of course ... Yes, that's my experience also. In any case, you really don't need to build the docs to read them. You might not like SGML, but it's not *that* hard to understand. Surely our patch reviewers can read the SGML text. Of course, we should check that the docs build cleanly after the patch is applied, but that's a different issue. As far as building goes, the CVS HEAD docs at http://developer.postgresql.org/pgdocs/postgres/index.html are rebuilt frequently, so we actually check as soon as the patch is applied. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
2009/7/17 Jeff Davis pg...@j-davis.com: Another idea that I thought about is that: ALTER TABLE foo ADD UNIQUE (a, b) USING foo_idx; could be a shorthand for: ALTER TABLE foo ADD INDEX CONSTRAINT (a =, b =) USING foo_idx; The benefit is that it could go over GiST indexes or hash indexes, not just btrees. The syntax could also be useful to turn an existing btree into a unique btree. I like that idea ... although how would this interact (if at all) with the existing pg_index.isunique flag? Would it become deprecated in favour of using indconstrats, or would you actually look at switching isunique to TRUE if somebody applies a constraint which is made up entirely of equality ops? Cheers, BJ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Docbook toolchain interfering with patch review?
2009/7/17 Josh Berkus j...@agliodbs.com: This seems like a serious issue for development. Reviewers, how many of you are able to build docs with each patch? Being able to build docs did require some fiddling with the docbook packages (on Gentoo). The only trick was working out exactly which packages I needed to install. Since getting past that, I've not had any problems building the docs, although it is pretty slow. As Merlin and Andrew have noted, being able to build the docs is a nice-to-have for documentation review, not a genuine requirement. You *can* review changes to SGML right there in the diff, especially if the changes are not extensive and/or don't alter the structure of the document. Cheers, BJ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers