[HACKERS] proposal: Additional parameters for RAISE statement
Hello, this proposal replaces an older, unsuccessful proposal for user exceptions - http://archives.postgresql.org/pgsql-hackers/2005-06/msg00683.php It only adds more parameters to the RAISE statement. Syntax: RAISE [NOTICE|WARNING|EXCEPTION] literal [, params] [WITH (eparam=expression, ...)]; Possible exception params: sqlstate, detail, detail_log, hint. Regards, Pavel Stehule -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cached Query Plans (was: global prepared statements)
The hairiness is in the plan's dependence (or independence) on parameter values: ideally we only want to cache plans that would be good for all parameter values, and only the user knows that precisely. Although it could be possible to examine the column histograms... If cached plans were implemented, the dependence on parameter values could be solved too: use special fork nodes in the plan which execute different sub-plans depending on parameter values/ranges, possibly looking up the stats at runtime, so that the plan is in a compiled state with the decision points wired in. This would of course mean much heavier planning and possibly much bigger plans, but you could afford that if you cache the plan. You could even have a special command to plan a query this way. Cheers, Csaba.
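The fork-node idea can be sketched in miniature (plain Python, all names invented for illustration - this is not how the PostgreSQL executor represents plans):

```python
# Toy sketch (not PostgreSQL internals): a cached plan whose root is a
# "fork" node that picks a pre-planned sub-plan at execution time based
# on the bound parameter value.

class SubPlan:
    def __init__(self, name, run):
        self.name = name
        self.run = run

class ForkNode:
    """Chooses among pre-built sub-plans using predicates on the parameter."""
    def __init__(self, branches, default):
        self.branches = branches   # list of (predicate, SubPlan)
        self.default = default

    def execute(self, param):
        for pred, plan in self.branches:
            if pred(param):
                return plan.name, plan.run(param)
        return self.default.name, self.default.run(param)

# Two hypothetical strategies: an index probe for selective values,
# a sequential scan for very frequent ones.
index_probe = SubPlan("index_scan", lambda v: f"probe index for {v!r}")
seq_scan = SubPlan("seq_scan", lambda v: f"scan whole table for {v!r}")

# Assumed to come from the column statistics at plan time.
FREQUENT_VALUES = {"CA", "TX"}

plan = ForkNode(
    branches=[(lambda v: v in FREQUENT_VALUES, seq_scan)],
    default=index_probe,
)

plan.execute("CA")   # frequent value: the seq_scan branch fires
plan.execute("WY")   # rare value: falls through to the index probe
```

The point of the sketch is only that the decision is wired into the cached plan and costs one predicate check at execution time, instead of a full replan.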
Re: [HACKERS] Cached Query Plans (was: global prepared statements)
If cached plans would be implemented, the dependence on parameter values could be solved too: use special fork nodes in the plan which execute different sub-plans depending on special parameter values/ranges, possibly looking up the stats at runtime, so that the plan is in a compiled state with the decision points wired in. This of course would mean a lot heavier planning and possibly a lot bigger plans, but you could afford that if you cache the plan. You could even have a special command to plan a query this way. And, the fork node could mutter to itself "Strange, I'm getting 1 rows instead of the 2 for which I was planned, perhaps I should switch to a different plan..." I have made another very simple hack to test another option. Bind message behaviour was modified:
- If the user asks for execution of a named prepared statement, and the named statement does not exist in PG's prepared statements cache, instead of issuing an error and borking the transaction, it binds to an empty statement that takes no parameters and returns no result. Parameters sent by the user are consumed but not used.
The application was modified thusly:
- Calls to pg_query_params were changed to calls to the following function:

function pg_query_cached( $sql, $params )
{
    // Try to execute it, using the query string as statement name.
    $q = pg_execute( $sql, $params );
    if( !$q ) die( pg_last_error() );
    // If it worked, return the result to the caller.
    if( pg_result_status( $q, PGSQL_STATUS_STRING ) != '' ) return $q;
    // If we got an empty query result (not a result with 0 rows, which is valid),
    // then prepare the query...
    $q = pg_prepare( $sql, $sql );
    if( !$q ) die( pg_last_error() );
    // ...and execute it again.
    $q = pg_execute( $sql, $params );
    if( !$q ) die( pg_last_error() );
    return $q;
}

Pros:
- It works
- It is very, very simple
- The user can choose between caching plans or not by calling pg_query_params() (no cached plans) or pg_query_cached() (cached plans)
- It works with persistent connections
Cons:
- It is too simple
- Plans are cached locally, so memory use is proportional to the number of connections
- It is still vulnerable to search_path problems
Re: [HACKERS] [Pljava-dev] stack depth limit exceeded - patch possible?
Dear Kris, am I understanding this correctly, that pl/java sets it for the main Java thread, so other threads spawned by this main thread and using postgres SPI functionality will run into stack_depth problems? I have read-only access in this application, so maybe my envisioned patched version (check_stack_depth doing nothing) will work for my proof-of-concept tests. Can you suggest another workaround? Regards, Alexander Wöhrer On Sat, 12 Apr 2008, Alexander Wöhrer wrote: I'm working on Windows XP SP2 (stack limit 3500 kb) and deployed successfully my application (doing some external Web service calling) inside PostgreSQL 8.3.0. Unfortunately, the application needs at least 3 threads and will run for quite some time. I found this comment http://pgfoundry.org/pipermail/pljava-dev/2005/000491.html by Thomas Hallgren where he mentioned that PostgreSQL only defines one stack and therefore pl/java has no way of telling PostgreSQL about multiple thread stack pointers. My question is now if there is a patched version available of PostgreSQL 8.3.0 having this stack_depth check disabled? This was fixed in postgresql/pljava shortly after the referenced discussion. As requested, postgresql 8.1+ allows modification of stack_base_ptr so pljava can set it as desired. Kris Jurka
Re: [HACKERS] Index AM change proposals, redux
Heikki Linnakangas wrote: Ron Mayer wrote: One use case that I think GIT would help a lot with is my large address tables that are clustered by zip-code but often queried by State, City, County, School District, Police Beat, etc. I imagine a GIT index on state would just occupy a couple pages at most regardless of how large the table gets. ... Not quite that much, though. GIT still stores one index pointer per heap page even on a fully clustered table. Otherwise it's not much good for searches. Then I wonder if I can conceive of yet another related index type that'd be useful for such clustered tables. If I had something like GIT that stored something like "values State='CA' can be found on pages 1000 through 1 and 2 through 21000", would it be even more effective on such a table than GIT? If so, it seems it'd give many advantages that partitioning by state could give (at least for querying).
Re: [HACKERS] Cached Query Plans (was: global prepared statements)
Csaba Nagy wrote: If cached plans would be implemented, the dependence on parameter values could be solved too: use special fork nodes in the plan which execute different sub-plans depending on special parameter values/ranges, possibly looking up the stats at runtime, so that the plan is in a compiled state with the decision points wired in. That's an idea I've been thinking about for a long time, but never got around to implementing. I see that as a completely orthogonal feature to the server-side shared plan cache, though. There are plenty of scenarios, like with a client-side prepared statement cache, where it would be useful. Figuring out the optimal decision points is hard, and potentially very expensive. There is one pretty simple scenario though: enabling the use of partial indexes, preparing one plan where a partial index can be used, and another one where it can't. Another such case is col LIKE ? queries, where ? is actually a prefix query, foo%. As an optimization, we could decide the decision points on the prepare message, and delay actually planning the queries until they're needed. That way we wouldn't waste time planning queries for combinations of parameters that are never used. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
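The "decide the decision points on the prepare message, plan lazily" idea for the col LIKE ? case can be sketched like this (plain Python; all names are hypothetical illustrations, not planner internals):

```python
# Sketch: classify the bound parameter into a decision-point bucket at
# execution time, and plan each bucket's query only when first needed.

class LazyForkPlan:
    def __init__(self, classify, plan_query):
        self.classify = classify      # parameter -> decision-point key
        self.plan_query = plan_query  # key -> "plan" (stand-in for real planning)
        self.subplans = {}            # plans created on demand
        self.plans_built = 0

    def execute(self, param):
        key = self.classify(param)
        if key not in self.subplans:             # plan lazily, once per bucket
            self.subplans[key] = self.plan_query(key)
            self.plans_built += 1
        return self.subplans[key]

# Decision point for: SELECT ... WHERE col LIKE $1
def classify_like(pattern):
    # A pattern like 'foo%' is a prefix search and can use a btree index;
    # anything else (e.g. '%foo') cannot.
    body = pattern[:-1]
    if pattern.endswith('%') and '%' not in body and '_' not in body:
        return 'prefix'
    return 'generic'

plan = LazyForkPlan(classify_like,
                    lambda key: 'index range scan' if key == 'prefix' else 'seq scan')

plan.execute('foo%')   # builds the prefix plan
plan.execute('bar%')   # reuses it
plan.execute('%baz')   # builds the generic plan on first need
```

Only the buckets that are actually hit ever get planned, which is the point of deferring the planning past the Prepare message.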
Re: [HACKERS] Cached Query Plans (was: global prepared statements)
PFC wrote: Bind message behaviour was modified: - If the user asks for execution of a named prepared statement, and the named statement does not exist in PG's prepared statements cache, instead of issuing an error and borking the transaction, it binds to an empty statement that takes no parameters and returns no result. Parameters sent by the user are consumed but not used. You mentioned the need for a wire protocol change to allow this. Why can't this be controlled with a server variable, like SET auto_prepare = 'true'? -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Index AM change proposals, redux
Ron Mayer wrote: Then I wonder if I can conceive of yet another related index type that'd be useful for such clustered tables. If I had something like GIT that stored something like values State='CA' can be found on pages 1000 through 1 and 2 through 21000 would it be even more effective on such a table than GIT? Yep, a bitmap index. Bitmap indexes don't actually work quite like that, but the point is that it is very efficient on a column with few distinct values, and even more so if the table is clustered on that column. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Cached Query Plans (was: global prepared statements)
On Mon, 2008-04-14 at 16:54 +0300, Heikki Linnakangas wrote: Figuring out the optimal decision points is hard, and potentially very expensive. There is one pretty simple scenario though: enabling the use of partial indexes, preparing one plan where a partial index can be used, and another one where it can't. Another such case is col LIKE ? queries, where ? is actually a prefix query, foo%. Another point is when the cardinality distribution of some key's values is very skewed, with some values very frequent and the majority of values being unique. There you could check the stats at execution time just for deciding whether to go for the low-cardinality plan or the high one... As an optimization, we could decide the decision points on the prepare message, and delay actually planning the queries until they're needed. That way we wouldn't waste time planning queries for combinations of parameters that are never used. ... or plan the query with the actual parameter value you get, and also record the range of the parameter values you expect the plan to be valid for. If at execution time the parameter happens to be out of that range, replan, and possibly add a new subplan covering the extra range. This could still work with prepared queries (where you don't get any parameter values to start with) by estimating the most probable parameter range (whatever that could mean), and planning for that. Cheers, Csaba.
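The "plan for the value you got, record the range it's valid for, replan out of range" suggestion can be sketched as a toy cache (plain Python; the planner, ranges, and plan names are invented for illustration):

```python
# Toy sketch: cache each plan together with the parameter range it is
# presumed valid for; a parameter outside every cached range triggers a
# replan, whose result is kept as an additional sub-plan.

class RangedPlanCache:
    def __init__(self, planner):
        self.planner = planner   # param -> (plan, (lo, hi) validity range)
        self.entries = []        # list of ((lo, hi), plan)
        self.replans = 0

    def execute(self, param):
        for (lo, hi), plan in self.entries:
            if lo <= param <= hi:
                return plan                       # cached plan still presumed valid
        plan, valid_range = self.planner(param)   # out of range: replan
        self.entries.append((valid_range, plan))  # cover the extra range too
        self.replans += 1
        return plan

# Hypothetical planner: small ids are rare, so an index scan pays off;
# large ids fall in a hot, frequently-matched range, so scan instead.
def toy_planner(param):
    if param < 1000:
        return 'index scan', (0, 999)
    return 'seq scan', (1000, 10**9)

cache = RangedPlanCache(toy_planner)
cache.execute(5)      # plans once, valid for 0..999
cache.execute(42)     # reuses that plan
cache.execute(5000)   # out of range: replans, adds a second sub-plan
```

This is also where the statistics dependency mentioned later in the thread would hook in: an ANALYZE that moves the value distribution would invalidate the recorded ranges.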
Re: [HACKERS] Cached Query Plans (was: global prepared statements)
On Mon, 2008-04-14 at 16:10 +0200, Csaba Nagy wrote: ... or plan the query with the actual parameter value you get, and also record the range of the parameter values you expect the plan to be valid for. If at execution time the parameter happens to be out of that range, replan, and possibly add a new subplan covering the extra range. This could still work with prepared queries (where you don't get any parameter values to start with) by estimating the most probable parameter range (whatever that could mean), and planning for that. More on that: recording the presumptions under which the (cached!) plan is thought to be valid would also facilitate setting up dependencies against statistics, to be checked when you analyze tables... and if the key value which you depend on with your query changed, the analyze process could possibly replan it in the background. Cheers, Csaba.
Re: [HACKERS] Cached Query Plans (was: global prepared statements)
... or plan the query with the actual parameter value you get, and also record the range of the parameter values you expect the plan to be valid for. If at execution time the parameter happens to be out of that range, replan, and possibly add a new subplan covering the extra range. This could still work with prepared queries (where you don't get any parameter values to start with) by estimating the most probable parameter range (whatever that could mean), and planning for that. Another thought: if the cached plans get their own table (as was suggested) then you could also start gathering parameter range statistics meaningfully... and on the next replan you know what to optimize your planning efforts for. Cheers, Csaba.
[HACKERS] Race conditions in relcache load (again)
Awhile back we did some significant rejiggering to ensure that no relcache load would be attempted without holding at least AccessShareLock on the relation. (Otherwise, if someone else is in process of making an update to one of the system catalog rows defining the relation, there's a race condition for SnapshotNow scans: the new row version might not be committed when you scan it, and if you come to the old row version second, it could be committed dead by the time you scan it, and then you don't see the row at all.) While thinking about Darren Reed's repeat trouble report http://archives.postgresql.org/pgsql-admin/2008-04/msg00113.php I realized that we failed to plug all the gaps of this type, because relcache.c contains *internal* cache load/reload operations that aren't protected. In particular the LOAD_CRIT_INDEX macro calls invoke relcache load on indexes that aren't locked. So they'd be at risk from a concurrent REINDEX or similar on those system indexes. RelationReloadIndexInfo seems at risk as well. AFAICS this doesn't explain Darren's problem because it would only be a transient failure at the instant of committing the REINDEX; and whatever he's being burnt by has persistent effects. Nonetheless it sure looks like a bug. Anyone think it isn't necessary to lock the target relation here? regards, tom lane
Re: [HACKERS] Cached Query Plans (was: global prepared statements)
Bind message behaviour was modified: - If the user asks for execution of a named prepared statement, and the named statement does not exist in PG's prepared statements cache, instead of issuing an error and borking the transaction, it binds to an empty statement that takes no parameters and returns no result. Parameters sent by the user are consumed but not used. You mentioned the need for a wire protocol change to allow this. Why can't this be controlled with a server variable, like SET auto_prepare = 'true'? Actually, thanks to the hack, the wire protocol doesn't change. Explanation: - Send Parse(SQL) to unnamed statement + Bind unnamed statement = works as usual (no cache). - Send only Bind (named statement) with a statement name that is not found in the cache = doesn't raise an error; instead it informs the application that the statement does not exist. The application can then prepare the statement (send a Parse message with SQL and a name) and give it a name. I used the SQL itself as the name, but you can use anything else. The application can then send the Bind again, which will (hopefully) work. So, here, the information (cache or don't cache) is passed from the client to the server in a hidden way: it depends on what function you use to send the query (unnamed statements are not cached, named statements are cached). There is no protocol change, but new information is provided to the server nonetheless. The downside is that the application needs to be modified (only a little, though), and applications that expect exceptions on "statement does not exist" will break, thus the necessity of a GUC to control it. It was just a quick and dirty test to see if this way of doing it was an option to consider or not. Apparently it works, but whether it is The Right Way remains to be seen...
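The end-to-end control flow being described can be simulated with a fake server (plain Python; the real mechanism lives in the backend's Bind message handling, and every name here is made up for illustration):

```python
# Simulation of the modified Bind behaviour: a Bind for an unknown named
# statement yields an "empty query" result instead of an error, and the
# client reacts by preparing the statement and retrying.

EMPTY = object()   # stands in for the empty-statement result

class FakeServer:
    def __init__(self):
        self.prepared = {}            # statement name -> SQL (the cached "plan")

    def bind_execute(self, name, params):
        if name not in self.prepared:
            return EMPTY              # modified behaviour: no error, empty result
        return ('rows for', self.prepared[name], params)

    def prepare(self, name, sql):
        self.prepared[name] = sql     # Parse message: plan and cache

def query_cached(server, sql, params):
    # Client side, mirroring pg_query_cached(): the SQL text itself is
    # used as the statement name.
    result = server.bind_execute(sql, params)
    if result is not EMPTY:
        return result                 # cache hit: one round trip
    server.prepare(sql, sql)          # cache miss: prepare, then retry
    return server.bind_execute(sql, params)

srv = FakeServer()
q = "SELECT * FROM t WHERE id = $1"
query_cached(srv, q, (1,))   # first call prepares the statement
query_cached(srv, q, (2,))   # second call hits the cache directly
```

The key property is visible in the sketch: no new message types are needed, only a change in how a Bind miss is answered.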
Re: [HACKERS] Cached Query Plans
I like the cross-session query plan caching talk. I would prefer if the feature was optional (i.e. a per-session "use cross-session query plan cache" variable). I like the idea of automatically re-planning if the estimate did not match the actual, with some softening technique involved, such as: if the last 3 times it ran it did the wrong thing, learn from our mistake and adapt. The other ideas about automatically deciding between plans based on ranges and such strike me as involving enough complexity and logic that, to do it properly, the query might as well be completely re-planned from the beginning to get the most benefit. Cheers, mark -- Mark Mielke [EMAIL PROTECTED]
Re: [HACKERS] Cached Query Plans
On Mon, 2008-04-14 at 10:55 -0400, Mark Mielke wrote: The other ideas about automatically deciding between plans based on ranges and such strike me as involving enough complexity and logic, that to do properly, it might as well be completely re-planned from the beginning to get the most benefit. ... except if you hard-wire the most common alternative plans, you still get the benefit of cached plan for a wider range of parameter values. Not to mention that if you know you'll cache the plan, you can try harder planning it right, getting possibly better plans for complex queries... you could argue that complex queries tend not to be repeated, but we do have here some which are in fact repeated a lot in batches, then discarded. So I guess a cached plan discard/timeout mechanism would also be nice. Cheers, Csaba.
Re: [HACKERS] Cached Query Plans (was: global prepared statements)
On Mon, 14 Apr 2008 16:17:18 +0200, Csaba Nagy [EMAIL PROTECTED] wrote: On Mon, 2008-04-14 at 16:10 +0200, Csaba Nagy wrote: ... or plan the query with the actual parameter value you get, and also record the range of the parameter values you expect the plan to be valid for. If at execution time the parameter happens to be out of that range, replan, and possibly add a new subplan covering the extra range. This could still work with prepared queries (where you don't get any parameter values to start with) by estimating the most probable parameter range (whatever that could mean), and planning for that. More on that: recording the presumptions under which the (cached!) plan is thought to be valid would also facilitate setting up dependencies against statistics, to be checked when you analyze tables... and if the key value which you depend on with your query changed, the analyze process could possibly replan it in the background. LOL, it started with the idea to make small queries faster, and now the brain juice is pouring. Those Decision nodes could potentially lead to lots of decisions (ahem). What if you have 10 conditions in the WHERE, plus some joined ones? That would make lots of possibilities... Consider several types of queries:
- The small, quick query which returns one or a few rows: in this case, planning overhead is large relative to execution time, but I would venture to guess that the plans always end up being the same.
- The query that takes a while: in this case, planning overhead is nil compared to execution time; better to replan every time with the params.
- The complex query that still executes fast because it doesn't process a lot of rows and postgres finds a good plan (for instance, a well optimized search query). Those would benefit from reducing the planning overhead, but those also typically end up having many different plans depending on the search parameters. Besides, those queries are likely to be dynamically generated.
So, would it be worth it to add all those features just to optimize those? I don't know...
[HACKERS] pg_dump object sorting
I have been looking at refining the sorting of objects in pg_dump to make it take advantage of buffering and synchronised scanning, and possibly make parallel restoration simpler and more efficient. My first thought was to sort indexes by namespace, tablename, indexname instead of by namespace, indexname. However, that doesn't go far enough, I think. Is there any reason we can't do all of a table's indexes and non-FK constraints together? Will that affect anything other than PK and UNIQUE constraints, as NOT NULL and CHECK constraints are included in table definitions? cheers andrew
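The effect of the proposed sort-key change can be shown with a tiny example (Python, illustrative only; pg_dump's actual sorting is C code, and the table and index names here are invented):

```python
# Sorting index entries by (namespace, tablename, indexname) instead of
# (namespace, indexname) groups all of a table's indexes together, so a
# restore can build them while the table's pages are still warm.

indexes = [
    ('public', 'orders',    'orders_pkey'),
    ('public', 'customers', 'by_email'),
    ('public', 'orders',    'by_order_date'),
    ('public', 'customers', 'customers_pkey'),
]

# Current order: by (namespace, indexname) -- a table's indexes scatter.
old_order = sorted(indexes, key=lambda ix: (ix[0], ix[2]))

# Proposed order: by (namespace, tablename, indexname) -- grouped per table.
new_order = sorted(indexes, key=lambda ix: (ix[0], ix[1], ix[2]))
```

With the old key, index names that don't start with their table's name end up interleaved between tables; with the new key, every table's indexes are adjacent in the dump.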
Re: [HACKERS] [Pljava-dev] stack depth limit exceeded - patch possible?
On Mon, 14 Apr 2008, Alexander Wöhrer wrote: am I understanding this correctly that pl/java sets it for the main Java thread, so other threads spawned by this main thread and using postgres SPI functionality will run into stack_depth_problems? pljava sets the stack_base_ptr for each thread just before it calls into the backend using SPI and resets it when that thread finishes using SPI. Only one thread can access the backend at a time, so multi-threaded pljava code is safe and this mangling of the stack_base_ptr keeps the backend happy. Can you suggest another workaround? Are you having any actual problems or is this all theoretical? I don't believe you should be having any issues, but if you're having a real problem, please post a self-contained test case so we can look into it. Kris Jurka
Re: [HACKERS] Cached Query Plans (was: global prepared statements)
On Mon, 2008-04-14 at 17:08 +0200, PFC wrote: Those Decision nodes could potentially lead to lots of decisions (ahem). What if you have 10 conditions in the WHERE, plus some joined ones? That would make lots of possibilities... Yes, that's true, but most of them are likely not relevant for the end result. In any real-life query there are a few parameters which are really important for what plan you should choose... the key here is that you should spend more time on finding the possibilities for a cached plan than you do for a one-shot query. In principle one-shot planning should be the default and caching should be something the user has to choose deliberately. I would really like a special command to plan and cache a query without actually executing it, possibly having a parameter for how hard to try... e.g. you could expend the extra cycles to eliminate all redundancies from boolean expressions and in-lists, to get the parse tree in a canonical format - all things which can make planning easier. All these lose in one-shot queries, but once you cache you can really do a lot of smarts which were no-nos before... Consider several types of queries: - The small, quick query which returns one or a few rows: in this case, planning overhead is large relative to execution time, but I would venture to guess that the plans always end up being the same. Consider a 'select a where b like $1' - the parameter $1 will considerably affect the query plan. A query can't go much simpler... - The query that takes a while: in this case, planning overhead is nil compared to execution time, better replan every time with the params. I guess these queries are not the ones targeted by this feature. In fact for these queries it really doesn't matter if you cache or not, except: if you know you're going to cache, you'll expend more effort planning right, and that could still matter for a query which runs long. Note that if you don't cache, planning harder will lose in the long run; only once you cache can you afford to plan harder... - The complex query that still executes fast because it doesn't process a lot of rows and postgres finds a good plan (for instance, a well optimized search query). Those would benefit from reducing the planning overhead, but those also typically end up having many different plans depending on the search parameters. Besides, those queries are likely to be dynamically generated. So, would it be worth it to add all those features just to optimize those? I don't know... We have here dynamically generated queries which are specifically chunked to be executed in small increments so none of the queries runs too long (they would block vacuuming vital tables otherwise). Those chunks would greatly benefit from properly planned and cached plans... A really smart system would store canonical plan fragments as responses to (also canonicalized) parse tree fragments, and then assemble the plan out of those fragments, but that would indeed be really complex (to design; the resulting code might be simpler than one thinks) ;-) Cheers, Csaba.
[HACKERS] Lessons from commit fest
There has been talk of the lessons we learned during this commit fest, but exactly what lessons did we learn? I am not clear on that, so I assume others are not as well. What are we going to do differently during the next commit fest? -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] pg_dump object sorting
On Mon, 2008-04-14 at 11:18 -0400, Andrew Dunstan wrote: I have been looking at refining the sorting of objects in pg_dump to make it take advantage of buffering and synchronised scanning, and possibly make parallel restoration simpler and more efficient. Synchronized scanning is explicitly disabled in pg_dump. That was a last-minute change to answer Greg Stark's complaint about dumping a clustered table: http://archives.postgresql.org/pgsql-hackers/2008-01/msg00987.php That hopefully won't be a permanent solution, because I think synchronized scans are useful for pg_dump. However, I'm not clear on how the pg_dump order would be able to better take advantage of synchronized scans anyway. What did you have in mind? Regards, Jeff Davis
Re: [HACKERS] Lessons from commit fest
Bruce Momjian wrote: There has been talk of the lessons we learned during this commit fest, but exactly what lessons did we learn? I am not clear on that so I assume others are not as well. What are we going to do differently during the next commit fest? As far as the Wiki page is concerned, it would be good to make sure the entries have a bit more info than just a header line -- things such as author, who reviewed and what did the reviewer say about it. Some of it is already there. Something else we learned is that the archives are central (well, we already knew that, but I don't think we had ever given them so broad use), and we've been making changes to them so that they are more useful to reviewers. Further changes are still needed on them, of course, to address the remaining problems. Lastly, I would say that pushing submitters to enter their sent patches into the Wiki worked -- we need to ensure that they keep doing it. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] pg_dump object sorting
Jeff Davis wrote: On Mon, 2008-04-14 at 11:18 -0400, Andrew Dunstan wrote: I have been looking at refining the sorting of objects in pg_dump to make it take advantage of buffering and synchronised scanning, and possibly make parallel restoration simpler and more efficient. Synchronized scanning is explicitly disabled in pg_dump. That was a last-minute change to answer Greg Stark's complaint about dumping a clustered table: http://archives.postgresql.org/pgsql-hackers/2008-01/msg00987.php That hopefully won't be a permanent solution, because I think synchronized scans are useful for pg_dump. However, I'm not clear on how the pg_dump order would be able to better take advantage of synchronized scans anyway. What did you have in mind? I should have expressed it better. The idea is to have pg_dump emit the objects in an order that allows the restore to take advantage of sync scans. So sync scans being disabled in pg_dump would not at all matter. cheers andrew
Re: [HACKERS] Lessons from commit fest
On Tue, Apr 15, 2008 at 2:45 AM, Alvaro Herrera wrote: As far as the Wiki page is concerned, it would be good to make sure the entries have a bit more info than just a header line -- things such as author, who reviewed and what did the reviewer say about it. In a wiki context, this sort of thing has got Template written all over it. I've done a little editing on Wikipedia, so I've got some idea about how to make wiki Templates work. I'm not claiming to be an expert, but if nobody else has a particular yen for it, I can have a go at setting up some simple templates to make it easier to add a patch or a review in a structured way. Lastly, I would say that pushing submitters to enter their sent patches into the Wiki worked -- we need to ensure that they keep doing it. +1. Although I wouldn't say that there has yet been a push for submitters to enter their patches into the wiki =) I've started adding my own patches to the wiki recently. The only thing about the process that sucks is that I need a URL linking to the message in the archives. I naturally want to add my patch to the wiki immediately after sending my email to -patches, and it takes some material interval of time for messages to show up on the archive. My solution was to just pull the message ID out of the headers in gmail and fudge the URL. So the URL I add to the wiki is actually borked until the archives refresh themselves, which is less than awesome ... Apart from the archive linkage thing, I found the process of queueing my own patches smooth, straightforward and satisfying. I would recommend it to other submitters, if for no other reason than to reduce the amount of drudge work the core team has to do to keep things in order.
Cheers, BJ
Re: [HACKERS] Remove lossy-operator RECHECK flag?
I've committed the runtime-recheck changes. Oleg had mentioned that GIST text search could be improved by using runtime rechecking, but I'll leave any refinements of that sort up to you.

One thing I was wondering about is that GIN and GIST are set up to preinitialize the recheck flag to TRUE; this means that if someone uses an old consistent() function that doesn't know it should set the flag, a recheck will be forced. But it seems to me that there's an argument for preinitializing to FALSE instead. There are four possibilities for what will happen with an un-updated consistent() function:

1. If we set the flag TRUE, and that's correct, everything is fine.
2. If we set the flag TRUE, and that's wrong (ie, the query is really exact) then a useless recheck occurs when we arrive at the heap. Nothing visibly goes wrong, but the query is slower than it should be.
3. If we set the flag FALSE, and that's correct, everything is fine.
4. If we set the flag FALSE, and that's wrong (ie, the query is really inexact), then rows that don't match the query may get returned.

By the argument that it's better to break things obviously than to break them subtly, risking case 4 seems more attractive than risking case 2.

This also ties into my previous question about what 8.4 pg_dump should do when seeing amopreqcheck = TRUE while dumping from an old server. I'm now convinced that the committed behavior (print RECHECK anyway) is the best choice, for a couple of reasons:

* It avoids silent breakage if the dump is reloaded into an old server.
* You'll have to deal with the issue anyhow if you made your dump with the older version's pg_dump.

What this means is that, if we make the preinitialization value FALSE, then an existing GIST/GIN opclass that doesn't use RECHECK will load just fine into 8.4 and everything will work as expected, even without touching the C code.
An opclass that does use RECHECK will fail to load from the dump, and if you're stubborn and edit the dump instead of getting a newer version of the module, you'll start getting wrong query answers. This means that all the pain is concentrated on the RECHECK-using case. And you can hardly maintain that you weren't warned about compatibility problems, if the dump didn't load ... On the other hand, if we make the preinitialization value TRUE, there's some pain for users whether they used RECHECK or not, and there won't be any obvious notification of the problem when they didn't. So I'm thinking it might be better to switch to the other preinitialization setting. Comments? regards, tom lane
Re: [HACKERS] Lessons from commit fest
Brendan Jurd escribió: On Tue, Apr 15, 2008 at 2:45 AM, Alvaro Herrera wrote: As far as the Wiki page is concerned, it would be good to make sure the entries have a bit more info than just a header line -- things such as author, who reviewed and what did the reviewer say about it. In a wiki context, this sort of thing has got Template written all over it. I've done a little editing on Wikipedia, so I've got some idea about how to make wiki Templates work. I'm not claiming to be an expert, but if nobody else has a particular yen for it, I can have a go at setting up some simple templates to make it easier to add a patch or a review in a structured way. Please have a go at it in a test page. The main principle we need is that the thing must be as easy as possible to edit. (FWIW if you can come up with a better markup than what we currently use, we'd welcome the advice.) Lastly, I would say that pushing submitters to enter their sent patches into the Wiki worked -- we need to ensure that they keep doing it. +1. Although I wouldn't say that there has yet been a push for submitters to enter their patches into the wiki =) Well, I pushed some authors via private email. Others did not seem to need any pushing :-) I've started adding my own patches to the wiki recently. The only thing about the process that sucks is that I need a URL linking to the message in the archives. I naturally want to add my patch to the wiki immediately after sending my email to -patches, and it takes some material interval of time for messages to show up on the archive. My solution was to just pull the message ID out of the headers in gmail and fudge the URL. Right, that's the way I use them too. Sorry we don't have anything better :-( The archives are refreshed every 10 minutes, so it's not like you need to wait for a week, either. Of course, I've configured my email client so that Message-Ids are shown all over the place, so I don't have to mess around clicking stuff.
-- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Lessons from commit fest
Brendan Jurd wrote: On Tue, Apr 15, 2008 at 2:45 AM, Alvaro Herrera wrote: As far as the Wiki page is concerned, it would be good to make sure the entries have a bit more info than just a header line -- things such as author, who reviewed and what did the reviewer say about it. In a wiki context, this sort of thing has got Template written all over it. I've done a little editing on Wikipedia, so I've got some idea about how to make wiki Templates work. I'm not claiming to be an expert, but if nobody else has a particular yen for it, I can have a go at setting up some simple templates to make it easier to add a patch or a review in a structured way. One problem I saw is that people commenting in the wiki sometimes didn't leave their names. It would be nice if that could be improved. Lastly, I would say that pushing submitters to enter their sent patches into the Wiki worked -- we need to ensure that they keep doing it. +1. Although I wouldn't say that there has yet been a push for submitters to enter their patches into the wiki =) I've started adding my own patches to the wiki recently. The only thing about the process that sucks is that I need a URL linking to the message in the archives. I naturally want to add my patch to the wiki immediately after sending my email to -patches, and it takes some material interval of time for messages to show up on the archive. My solution was to just pull the message ID out of the headers in gmail and fudge the URL. So the URL I add to the wiki is actually borked until the archives refresh themselves, which is less than awesome ... Yea, I have to wait to add URLs to the TODO list too, but as you mentioned I can now use message-id URLs, though I try to avoid them because they aren't as attractive --- doesn't matter for the wiki. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup.
+
Re: [HACKERS] pg_dump object sorting
Andrew Dunstan [EMAIL PROTECTED] writes: I should have expressed it better. The idea is to have pg_dump emit the objects in an order that allows the restore to take advantage of sync scans. So sync scans being disabled in pg_dump would not at all matter. Unless you do something to explicitly parallelize the operations, how will a different ordering improve matters? I thought we had a paper design for this, and it involved teaching pg_restore how to use multiple connections. In that context it's entirely up to pg_restore to manage the ordering and ensure dependencies are met. So I'm not seeing how it helps to have a different sort rule at pg_dump time --- it won't really make pg_restore's task any easier. regards, tom lane
Re: [HACKERS] Lessons from commit fest
Alvaro Herrera [EMAIL PROTECTED] writes: As far as the Wiki page is concerned, it would be good to make sure the entries have a bit more info than just a header line -- things such as author, who reviewed and what did the reviewer say about it. I think it'd be easy to go overboard there. One thing we learned from Bruce's page is that a display with ten or more lines per work item is not very helpful ... you start wishing you had a summary, and then the whole design cycle repeats. We should not try to make the wiki page be a substitute for reading the linked-to discussions. (This is one of the reasons that dropped links in the archives, such as the month-end problem, are so nasty.) One idea is to make more effective use of horizontal space than we were doing with the current wiki-page markup. I'd have no objection to including the author's name and a status indicator if it all fit on the same line as the patch title/link. regards, tom lane
Re: [HACKERS] Remove lossy-operator RECHECK flag?
2. If we set the flag TRUE, and that's wrong (ie, the query is really exact) then a useless recheck occurs when we arrive at the heap. Nothing visibly goes wrong, but the query is slower than it should be. 4. If we set the flag FALSE, and that's wrong (ie, the query is really inexact), then rows that don't match the query may get returned. By the argument that it's better to break things obviously than to break them subtly, risking case 4 seems more attractive than risking case 2. The single thought is: usually, it's very hard to see that a query returns more results than it should. It doesn't matter for fulltext search (and it has a very good chance to stay unnoticed forever because wrong rows will be sorted down by the ranking function, although performance will decrease. But text search is now built-in :-) ), but for other modules it may be critical, especially when the content of the db depends on the result of such a query. It seems to me there was a bug in btree at one time - it didn't enforce uniqueness and some values were duplicated. Users noticed that only while restoring the db. What this means is that, if we make the preinitialization value FALSE, then an existing GIST/GIN opclass that doesn't use RECHECK will load just fine into 8.4 and everything will work as expected, even without touching the C code. Yes. An opclass that does use RECHECK will fail to load from the dump, and if you're stubborn and edit the dump instead of getting a newer version of the module, you'll start getting wrong query answers. This means that all the pain is concentrated on the RECHECK-using case. And you can hardly maintain that you weren't I don't think that restoring a dump from old versions with modules is good practice anyway, and a saved RECHECK will signal the problem earlier. If a user edits the dump to remove the RECHECK flag then he is an enemy to himself. So I'm thinking it might be better to switch to the other preinitialization setting. Comments? Agreed.
-- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/
Re: [HACKERS] Remove lossy-operator RECHECK flag?
Teodor Sigaev [EMAIL PROTECTED] writes: By the argument that it's better to break things obviously than to break them subtly, risking case 4 seems more attractive than risking case 2. The single thought is: usually, it's very hard to see that query returns more results that it should be. It doesn't matter for fulltext search (and it has very good chance to stay unnoticed forever because wrong rows will be sorted down by ranking function, although performance will decrease. Hmm ... that's a good point. And the performance loss that I'm complaining about is probably not large, unless you've got a *really* expensive operator. Maybe we should leave it as-is. Anybody else have an opinion? regards, tom lane
Re: [HACKERS] Lessons from commit fest
Tom Lane wrote: A smaller lesson was that you can't start fest without a queue of ready-to-work-on patches. We seem to be evolving towards a plan where stuff gets dumped onto the wiki page more or less immediately as it comes in. That should take care of that problem, though I'd still like to see someone accept responsibility for making sure patches get listed whether or not their author does it. I'm on it. If someone wants to act as backup, he or she is more than welcome. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Lessons from commit fest
On Tue, Apr 15, 2008 at 4:12 AM, Tom Lane wrote: Alvaro Herrera writes: As far as the Wiki page is concerned, it would be good to make sure the entries have a bit more info than just a header line -- things such as author, who reviewed and what did the reviewer say about it. We should not try to make the wiki page be a substitute for reading the linked-to discussions. (This is one of the reasons that dropped links in the archives, such as the month-end problem, are so nasty.) One idea is to make more effective use of horizontal space than we were doing with the current wiki-page markup. I'd have no objection to including the author's name and a status indicator if it all fit on the same line as the patch title/link. When you say horizontal space, I start thinking in terms of a table, like the ancestor of the current wiki commitfest queue, ye olde http://developer.postgresql.org/index.php/Todo:PatchStatus However, the table format doesn't provide well for brief review comments, such as we currently have for the 64bit pass-by-value patch in the May CommitFest page. Personally I think the review addenda are nice. As Tom says, it shouldn't be a replacement for actually reading the thread, but it helps to get people up to speed on the status of the patch. I think it's a useful primer for those who may be interested in looking at the patch in more detail. Anyway, take a look at http://wiki.postgresql.org/wiki/Template:Patch, and http://wiki.postgresql.org/wiki/User_talk:Direvus for an example of how it is used. This is my first stab at writing a patch queue entry template. It uses a similar structure to what's already on the CommitFest pages. To make this work, all a patch submitter needs to do is go to the relevant CommitFest page and add {{patch|||}} to the Pending Patches section. If you guys find this useful, I'd be happy to add a review template in a similar vein.
Cheers, BJ
Re: [HACKERS] Lessons from commit fest
Bruce Momjian [EMAIL PROTECTED] writes: There has been talk of the lessons we learned during this commit fest, but exactly what lessons did we learn? Actually, the *main* lesson we learned was don't start with a 2000-email inbox. There were a couple of reasons that the queue was so forbidding: 1. We were in feature freeze for 11 months and consequently built up a pretty sizable backlog of stuff that had been postponed to 8.4. We have to avoid ever doing that again. We've already made some process changes to try to avoid getting stuck that way, and we have to be willing to change some more if the current plan doesn't work. But that wasn't a lesson of the commit fest, we already knew it was broken :-(. This was just inevitable pain from our poor management of the last release cycle. 2. A whole lot of the 2000 emails were not actually about reviewable patches. I'm willing to take most of the blame here --- I pushed Bruce to publish the list before he'd finished doing as much clean-up as he wanted, and I also encouraged him to leave in some long design discussion threads that seemed to me to warrant more discussion. (And part of the reason I thought so was that I'd deliberately ignored those same threads when they were active, because I was busy trying to get 8.3 out the door; so again this was partly delayed pain from the 8.3 mess.) In hindsight we didn't get very much design discussion done during the fest, and I think it's unlikely to happen in future either. We should probably try to limit the focus of fests to consider only actual patches, with design discussions handled live during normal development, the way they've always been in the past. A smaller lesson was that you can't start a fest without a queue of ready-to-work-on patches. We seem to be evolving towards a plan where stuff gets dumped onto the wiki page more or less immediately as it comes in.
That should take care of that problem, though I'd still like to see someone accept responsibility for making sure patches get listed whether or not their author does it. Other lessons that were already brought up: * Bruce's mbox page was not a good status tool, despite his efforts to improve it; * If Bruce and I are the only ones doing anything, it's gonna be slow. regards, tom lane
Re: [HACKERS] Remove lossy-operator RECHECK flag?
Tom Lane wrote: Teodor Sigaev [EMAIL PROTECTED] writes: By the argument that it's better to break things obviously than to break them subtly, risking case 4 seems more attractive than risking case 2. The single thought is: usually, it's very hard to see that query returns more results that it should be. It doesn't matter for fulltext search (and it has very good chance to stay unnoticed forever because wrong rows will be sorted down by ranking function, although performance will decrease. Hmm ... that's a good point. And the performance loss that I'm complaining about is probably not large, unless you've got a *really* expensive operator. Maybe we should leave it as-is. Anybody else have an opinion? Better slow than wrong in this case. The better to break obviously than subtly argument doesn't hold here, because slow isn't the same as broken, and returning extra incorrect rows isn't obviously :-). -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Lessons from commit fest
Tom Lane wrote: the fest, and I think it's unlikely to happen in future either. We should probably try to limit the focus of fests to consider only actual patches, with design discussions handled live during normal development, the way they've always been in the past. We have discouraged design discussions during feature freeze and beta --- do we want to change that? -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Lessons from commit fest
Tom Lane wrote: A smaller lesson was that you can't start fest without a queue of ready-to-work-on patches. We seem to be evolving towards a plan where stuff gets dumped onto the wiki page more or less immediately as it comes in. That should take care of that problem, though I'd still like to see someone accept responsibility for making sure patches get listed whether or not their author does it. Other lessons that were already brought up: * Bruce's mbox page was not a good status tool, despite his efforts to improve it; * If Bruce and I are the only ones doing anything, it's gonna be slow. Even after the wiki was set up there was still limited involvement in patch application except for me and Tom. The wiki is going to allow people to see more easily what is outstanding, but it doesn't seem to have translated into more people involved in finishing the commit fest. Humorously, it is like televising a football game --- more people watch, but it doesn't help the players on the field. :-( -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Lessons from commit fest
Bruce Momjian wrote: Tom Lane wrote: the fest, and I think it's unlikely to happen in future either. We should probably try to limit the focus of fests to consider only actual patches, with design discussions handled live during normal development, the way they've always been in the past. We have discouraged design discussions during feature freeze and beta --- do we want to change that? In fact, discussions were discouraged during the March commitfest too. Perhaps this is a good idea -- discussions on new development should be deferred until the end of the commitfest, if one is in progress. We'll see what happens on the May commitfest, but my guess is that it will be a lot shorter than the March one. If it takes 1.5-2 weeks, it's not so bad, and it means people are eager to get the current patches done as quickly as possible so that discussions on items they are interested in can continue. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Lessons from commit fest
Alvaro Herrera wrote: Bruce Momjian wrote: Tom Lane wrote: the fest, and I think it's unlikely to happen in future either. We should probably try to limit the focus of fests to consider only actual patches, with design discussions handled live during normal development, the way they've always been in the past. We have discouraged design discussions during feature freeze and beta --- do we want to change that? In fact, discussions were discouraged during the March commitfest too. Perhaps this is a good idea -- discussions on new development should be deferred until the end of the commitfest, if one is in progress. We'll see what happens on the May commitfest, but my guess is that it will be a lot shorter than the March one. If it takes 1.5-2 weeks, it's not so bad, and it means people are eager to get the current patches done as quickly as possible so that discussions on items they are interested in can continue. If you do that then the discussions bunch up, which is part of the reason we had 2k emails for the March commit fest. Tom is saying not to delay those discussions. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Lessons from commit fest
Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: We should probably try to limit the focus of fests to consider only actual patches, with design discussions handled live during normal development, the way they've always been in the past. We have discouraged design discussions during feature freeze and beta --- do we want to change that? No, changing that wasn't what I meant to suggest. My point here is that we'd dropped a number of big mushy discussions into the queue with the idea that they'd be re-opened during commit fest, but new discussion did not happen to any useful extent. Conclusion: don't bother putting anything less concrete than a patch on the fest queue. regards, tom lane
Re: [HACKERS] Lessons from commit fest
Bruce Momjian wrote: Alvaro Herrera wrote: Perhaps this is a good idea -- discussions on new development should be deferred until the end of the commitfest, if one is in progress. We'll see what happens on the May commitfest, but my guess is that it will be a lot shorter than the March one. If it takes 1.5-2 weeks, it's not so bad, and it means people are eager to get the current patches done as quickly as possible so that discussions on items they are interested in can continue. If you do that then the discussions bunch up, which is part of the reason we had 2k emails for the March commit fest. Tom is saying not to delay those discussions. The problem here was that the discussions bunched up during a full year. Having them bunch up for a couple of weeks is not so bad. And the problem with those discussions continuing is that the developers participating in them are not working on the items in the queue. We want to encourage them to review the things already done, not to get distracted on new stuff. Of course, it is cat-herding, so it's not going to work 100%. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Lessons from commit fest
Alvaro Herrera [EMAIL PROTECTED] writes: Perhaps this is a good idea -- discussions on new development should be deferred until the end of the commitfest, if one is in progress. Well, that was part of the original idea of a commit fest, on the theory that everyone should be helping review/commit instead. But not that many people seem to have bought into it ... We'll see what happens on the May commitfest, but my guess is that it will be a lot shorter than the March one. If it takes 1.5-2 weeks, it's not so bad, and it means people are eager to get the current patches done as quickly as possible so that discussions on items they are interested in can continue. Yeah. This first one was going to be an aberration from the get-go, just because of the backlog. I don't think we should draw too many conclusions from it about how future ones will go. But they'd *better* be shorter. regards, tom lane
Re: [HACKERS] Lessons from commit fest
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: We should probably try to limit the focus of fests to consider only actual patches, with design discussions handled live during normal development, the way they've always been in the past. We have discouraged design discussions during feature freeze and beta --- do we want to change that? No, changing that wasn't what I meant to suggest. My point here is that we'd dropped a number of big mushy discussions into the queue with the idea that they'd be re-opened during commit fest, but new discussion did not happen to any useful extent. Conclusion: don't bother putting anything less concrete than a patch on the fest queue. So when/how do those discussions get resolved? -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Lessons from commit fest
Tom Lane [EMAIL PROTECTED] writes: Other lessons that were already brought up: * Bruce's mbox page was not a good status tool, despite his efforts to improve it; * If Bruce and I are the only ones doing anything, it's gonna be slow. How did you feel about the idea of having a Tom-free commitfest for May? You would get to sit back and comment on other people's attempts to review patches, just shouting if they seem to be headed in the wrong direction. And of course work on your own ideas you've probably been itching to do since before 8.3 feature freeze. I assume you realize it's not that I don't appreciate having you doing all this work, but I think it would be a good exercise for us to go through once. (And you certainly deserve a break!) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [HACKERS] Remove lossy-operator RECHECK flag?
Heikki Linnakangas [EMAIL PROTECTED] writes: Anybody else have an opinion? Better slow than wrong in this case. The better to break obviously than subtly argument doesn't hold here, because slow isn't the same as broken, and returning extra incorrect rows isn't obviously :-). We're talking about code which is recompiled for a new version of Postgres but not altered to return the recheck flag for every tuple? Can we rig the code so it effectively returns recheck=true all the time in that case? If so then it would be safe to ignore the recheck flag on the opclass. There's no danger of picking up code which was actually compiled with older header files after all, the magic numbers wouldn't match if it's V1 and in any case I would expect it to crash long before any mistaken tuples were returned. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
Re: [HACKERS] Lessons from commit fest
On Mon, 14 Apr 2008 21:12:59 +0100 Gregory Stark [EMAIL PROTECTED] wrote: I assume you realize it's not that I don't appreciate having you doing all this work but I think it would be a good exercise for us to go through once. (And you certainly deserve a break!) Although I applaud the sentiment I think a more reasonable approach would be (if Tom wanted) to have Tom pick 3-5 patches (or whatever) that are his deal. That's it. No extra bubble activities for you. Except for Buddha level oversight the rest falls on the rest of the community. It isn't like we don't have at least 6 major contributors that can do patch review... Alvaro, Greg, Neil, AndrewD, Heikki, and Magnus ... Not to mention a slew of others who can at least lend a hand. Sincerely, Joshua D. Drake -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
Re: [HACKERS] Lessons from commit fest
Bruce Momjian [EMAIL PROTECTED] writes:
> Tom Lane wrote:
>> No, changing that wasn't what I meant to suggest. My point here is that we'd dropped a number of big mushy discussions into the queue with the idea that they'd be re-opened during commit fest, but new discussion did not happen to any useful extent. Conclusion: don't bother putting anything less concrete than a patch on the fest queue.
> So when/how do those discussions get resolved?

[ shrug... ] You can't force ideas to happen on a schedule. If you don't want an issue to get forgotten, then make a TODO entry for it. But the purpose of commit fest is to make sure we deal with things that can be dealt with in a timely fashion. It's not going to cause solutions to unsolved problems to appear from nowhere.

regards, tom lane
Re: [HACKERS] Remove lossy-operator RECHECK flag?
Gregory Stark [EMAIL PROTECTED] writes:
> We're talking about code which is recompiled for a new version of Postgres but not altered to return the recheck flag for every tuple? Can we rig the code so it effectively returns recheck=true all the time in that case?

That's what we've got now. I just thought the choice could do with a bit harder look.

regards, tom lane
Re: [HACKERS] Lessons from commit fest
Gregory Stark [EMAIL PROTECTED] writes:
> How did you feel about the idea of having a Tom-free commitfest for May?

Not a lot, though certainly I'd be willing to disengage from trivial patches if someone else picked them up. One problem with this fest was that a whole lot of the patches *weren't* trivial; if they had been, they'd not have gotten put off till 8.4. So there weren't that many that I wanted to let go in without looking at them. I guess that's just another way in which the 8.3 schedule problem came home to roost during this fest. In future fests we should have a much higher proportion of little things that maybe more people would feel comfortable taking responsibility for. Perhaps it would be useful to try to rate pending patches by difficulty?

regards, tom lane
Re: [HACKERS] pg_dump object sorting
Tom Lane wrote:
> Andrew Dunstan [EMAIL PROTECTED] writes:
>> I should have expressed it better. The idea is to have pg_dump emit the objects in an order that allows the restore to take advantage of sync scans. So sync scans being disabled in pg_dump would not at all matter.
> Unless you do something to explicitly parallelize the operations, how will a different ordering improve matters? I thought we had a paper design for this, and it involved teaching pg_restore how to use multiple connections. In that context it's entirely up to pg_restore to manage the ordering and ensure dependencies are met. So I'm not seeing how it helps to have a different sort rule at pg_dump time --- it won't really make pg_restore's task any easier.

Well, what actually got me going on this initially was that I got annoyed by having indexes not grouped by table when I dumped out the schema of a database, because it seemed a bit illogical. Then I started thinking about it, and it seemed to me that even without synchronised scanning or parallel restoration, we might benefit from building all the indexes of a given table together, especially if the whole table could fit in either our cache or the OS cache.

cheers andrew
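Andrew's grouping idea reduces to a sort key change; here is a minimal, purely illustrative sketch (the table and index names are hypothetical, and this is not pg_dump's real TOC handling) of emitting index entries grouped by their table:

```python
# Sketch of the proposed ordering: emit CREATE INDEX entries grouped by
# table, so every index of one table is built while that table's heap is
# still hot in cache. Entries below are hypothetical examples.
def order_indexes(entries):
    """Sort (table, index) pairs so indexes of the same table are adjacent."""
    return sorted(entries, key=lambda e: (e[0], e[1]))

toc = [("orders", "orders_pk"), ("users", "users_email_ix"),
       ("orders", "orders_date_ix"), ("users", "users_pk")]
for table, index in order_indexes(toc):
    print(table, index)
```

With this ordering, both `orders` indexes come out before either `users` index, which is the cache-friendliness Andrew is after.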
Re: [HACKERS] Lessons from commit fest
On Tue, Apr 15, 2008 at 6:39 AM, Tom Lane wrote:
> Perhaps it would be useful to try to rate pending patches by difficulty?

Just a thought, but the file size of a context diff has a pretty good correlation to the patch's intrusiveness / complexity. As a metric of difficulty it's very naive, but it's also incredibly easy to measure ...

Cheers, BJ
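The diff-size metric really is trivial to automate; a minimal sketch (the file paths are hypothetical) that ranks patch files from smallest to largest:

```python
# Naive "difficulty" ranking from the thread: sort pending patches by the
# byte size of their context diffs. Paths here are hypothetical examples.
import os

def rank_by_size(patch_paths):
    """Smallest (presumably easiest) patch first."""
    return sorted(patch_paths, key=os.path.getsize)
```

As the post says, this is very naive -- a one-line change to the planner can be harder to review than a 500-line doc patch -- but it costs nothing to compute.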
Re: [HACKERS] Lessons from commit fest
Brendan Jurd wrote:
> Anyway, take a look at http://wiki.postgresql.org/wiki/Template:Patch, and http://wiki.postgresql.org/wiki/User_talk:Direvus for an example of how it is used. This is my first stab at writing a patch queue entry template. It uses a similar structure to what's already on the CommitFest pages.

Thanks, I changed a couple of patches to this method, and quickly hacked up a new template for reviews. Re: horizontal whitespace in the patch template, I wonder if we can put the author name and status in the same line as the patch name. Separated from the patch name -- perhaps aligned right, if possible. Maybe something like

[bold blue]Foobar the Bozos[/bold blue] ([bold black]Status[/bold black])    ...lotsa whitespace...    Author

-- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Lessons from commit fest
On Tue, Apr 15, 2008 at 7:00 AM, Alvaro Herrera wrote:
> Thanks, I changed a couple of patches to this method, and quickly hacked up a new template for reviews. Re: horizontal whitespace in the patch template, I wonder if we can put the author name and status in the same line as the patch name. Separated from the patch name -- perhaps aligned right, if possible. Maybe something like
>
> Foobar the Bozos (Status)    ...    Author

I've changed Template:patch as you suggest, using float: right to shift the author over to the right hand side. Feedback welcome.

Cheers, BJ
Re: [HACKERS] Lessons from commit fest
Tom Lane wrote:
> Bruce Momjian [EMAIL PROTECTED] writes:
>> Tom Lane wrote:
>>> No, changing that wasn't what I meant to suggest. My point here is that we'd dropped a number of big mushy discussions into the queue with the idea that they'd be re-opened during commit fest, but new discussion did not happen to any useful extent. Conclusion: don't bother putting anything less concrete than a patch on the fest queue.
>> So when/how do those discussions get resolved?
> [ shrug... ] You can't force ideas to happen on a schedule. If you don't want an issue to get forgotten, then make a TODO entry for it. But the purpose of commit fest is to make sure we deal with things that can be dealt with in a timely fashion. It's not going to cause solutions to unsolved problems to appear from nowhere.

What I need to know is whether they are ideas worthy of TODO.

-- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
[HACKERS] Removing typename from A_Const (was: Empty arrays with ARRAY[])
On Fri, Mar 21, 2008 at 7:47 AM, Tom Lane wrote:
> I didn't do anything about removing A_Const's typename field, but I'm thinking that would be a good cleanup patch.

Here's my attempt to remove the typename field from A_Const. There were a few places (notably flatten_set_variable_args() in guc.c, and typenameTypeMod() in parse_type.c) where the code expected to see an A_Const with a typename, and I had to adjust for an A_Const within a TypeCast. Nonetheless, there was an overall net reduction of 34 lines of code, so I think this was a win. All regression tests passed on x86_64 gentoo. Added to May CommitFest.

Cheers, BJ

Attachment: aconst-no-typename_0.diff.bz2 (BZip2 compressed data)
Re: [HACKERS] Lessons from commit fest
On Tue, Apr 15, 2008 at 2:45 AM, Alvaro Herrera wrote:
> Lastly, I would say that pushing submitters to enter their sent patches into the Wiki worked -- we need to ensure that they keep doing it.

I'd also suggest, if you want to get serious about encouraging submitters to add their patches to the wiki:

* A Big Fat Link to the current commitfest page on the main page of the wiki.
* Something in the Developers' FAQ section about submitting patches, with a link to the wiki and some brief instructions about how to register an account and add a patch to the queue.

Cheers, BJ
[HACKERS] rule for update view that updates/inserts into 2 tables
I've posted this on pgsql-general and pgsql-sql, and haven't got any responses. If any of you would be able to take a look at this for me and give some feedback, I'd be obliged. I would like to create a rule that, by updating a view, allows me to update one table and insert into another. The following example illustrates what I'm trying to do:

--Create Tables
CREATE TABLE my_table (
    my_table_id serial,
    a character varying(255),
    b character varying(255),
    CONSTRAINT my_table_id_pk PRIMARY KEY (my_table_id)
);

CREATE TABLE my_audit_table (
    audit_id serial,
    my_table_id int,
    c character varying(255),
    CONSTRAINT audit_id_pk PRIMARY KEY (audit_id)
);

--Create View
CREATE OR REPLACE VIEW my_view AS
    SELECT t.my_table_id, t.a, t.b, au.audit_id, au.c
    FROM my_table t, my_audit_table au
    WHERE t.my_table_id = au.my_table_id;

--Create Rules
CREATE OR REPLACE RULE insert_to_my_view AS ON INSERT TO my_view DO INSTEAD (
    INSERT INTO my_table (a, b) VALUES (new.a, new.b);
    INSERT INTO my_audit_table (my_table_id, c)
        VALUES (currval('my_table_my_table_id_seq'), new.c);
);

CREATE OR REPLACE RULE update_my_view AS ON UPDATE TO my_view DO INSTEAD (
    UPDATE my_table SET a = new.a, b = new.b WHERE my_table_id = old.my_table_id;
    INSERT INTO my_audit_table (my_table_id, c) VALUES (new.my_table_id, new.c);
);

--The insert statement below inserts one row into my_table, and one row into my_audit_table
--(This works the way I would like)
insert into my_view (a, b, c) values ('a contents', 'b contents', 'c contents');

--The update statement below doesn't work the way I want.
--What I would like this to do is to update one row in my_table, and insert
--one row into my_audit_table. It does the update fine, but the insert to my_audit_table
--doesn't work as I had anticipated.
update my_view set a = 'new a contents', b = 'new b contents', c = 'new c contents'
    where my_table_id = 1;

If I execute the above update statement multiple times, multiple rows will be inserted with each call after the first call.
Specifically,
- after the first call, 1 row is inserted
- after the second call, 2 rows are inserted
- after the third call, 4 rows are inserted
- after the fourth call, 8 rows are inserted... and so on

The problem is due to the INSERT in the update_my_view rule:

    INSERT INTO my_audit_table (my_table_id, c) VALUES (new.my_table_id, new.c);

Apparently, new.my_table_id in this case references more than one row, if more than one row with the given id already exists in my_audit_table. How do I accomplish what I want to accomplish here? I'd prefer not to use a stored procedure.

Thanks, Chad
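The 1, 2, 4, 8 doubling is consistent with how rules are macro-expanded: the rule's INSERT is rewritten against the view's join, so it effectively becomes an INSERT ... SELECT that emits one audit row per view row matching the WHERE clause, and the number of view rows for a given id equals the number of audit rows already present. A small Python simulation (purely illustrative -- this is not PostgreSQL code) reproduces the pattern:

```python
# Hypothetical simulation of the rule expansion. Rules are macros: "new"
# is rewritten against the view's FROM clause, so the rule's INSERT fires
# once per existing (my_table JOIN my_audit_table) row for the updated id,
# not once per UPDATE statement.

audit = []                      # my_audit_table rows for my_table_id = 1

def insert_via_view():
    audit.append("c contents")  # the INSERT rule adds exactly one audit row

def update_via_view():
    # one new audit row per view row for id 1, i.e. per existing audit row
    audit.extend(["new c contents"] * len(audit))

insert_via_view()
inserted = []
for _ in range(4):
    before = len(audit)
    update_via_view()
    inserted.append(len(audit) - before)

print(inserted)   # [1, 2, 4, 8] -- the doubling reported above
```

Because the multiplication comes from the join itself, rewriting the rule alone cannot make the INSERT single-row once duplicate audit rows exist.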
[HACKERS] bug in localized \df+ output
I'm seeing this:

Liste des fonctions
-[ RECORD 1 ]+
Schéma | public
Nom | tg_backlink_a
Type de données du résultat| trigger
Type de données des paramètres |
Volatibilité| volatile
Propriétaire| alvherre
Langage | plpgsql
Code source |
: declare
:     dummy\x09integer;

This is \x \df+ tg_* on the regression database. server_encoding=utf8, client_encoding=utf8. Notice the alignment problem. I'm guessing that we're counting bytes, not chars, to build the table on psql. Also the \x09 thing is pretty ugly (I think this was already reported?) :-(

-- Alvaro Herrera
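The bytes-vs-chars guess is easy to demonstrate: in UTF-8, a string's byte length exceeds its character count as soon as it contains a non-ASCII character, so padding computed from byte length comes out wrong. A quick sketch using one of the headers above:

```python
# "Schéma" is 6 characters on screen, but "é" encodes as two bytes in
# UTF-8, so the string is 7 bytes long. If the column width is computed
# from bytes, the pad added is one space short for every accented
# character, which produces exactly the ragged alignment shown above.
header = "Schéma"
chars = len(header)                   # display width in characters
nbytes = len(header.encode("utf-8"))  # what a byte-counting psql would see
print(chars, nbytes)
```

(This ignores the further wrinkle of double-width East Asian characters, where even counting characters isn't enough and a wcwidth-style calculation is needed.)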
Re: [HACKERS] Cached Query Plans
On Mon, Apr 14, 2008 at 5:01 PM, Csaba Nagy [EMAIL PROTECTED] wrote:
> On Mon, 2008-04-14 at 10:55 -0400, Mark Mielke wrote:
>> The other ideas about automatically deciding between plans based on ranges and such strike me as involving enough complexity and logic, that to do properly, it might as well be completely re-planned from the beginning to get the most benefit.
> ... except if you hard-wire the most common alternative plans, you still get the benefit of a cached plan for a wider range of parameter values. Not to mention that if you know you'll cache the plan, you can try harder planning it right, getting possibly better plans for complex queries... you could argue that complex queries tend not to be repeated, but we do have here some which are in fact repeated a lot in batches, then discarded. So I guess a cached plan discard/timeout mechanism would also be nice.

I think ANALYZE on the tables involved should _force_ replanning of a cached query. After all, if ANALYZE was fired, then the contents changed substantially, and replanning feels like a good idea. As for the planner getting smarter (and slower ;)) -- complex queries tend not to be repeated -- so it is worth the trouble to plan them carefully. These would benefit from a smarter planner with or without caching. The problem is with simple queries, which one can argue are the majority of queries. That's where the caching comes in. If you cache the queries, you can let the planner be smarter (and slower). If you don't cache, you probably don't want to trade frequent simple queries' speed for a once-in-a-while complex query. That stated, for me the most important feature is the possibility to have good online query statistics. :)

Regards, Dawid
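Dawid's ANALYZE-forces-replan idea can be sketched as a cache keyed by query text that remembers, per table, how many ANALYZE runs had been seen when the plan was built. All the names below are hypothetical -- this is an illustration of the invalidation rule, not PostgreSQL's actual plan cache:

```python
# Sketch: cache plans by query text; rebuild a plan whenever any table it
# touches has been ANALYZEd since the plan was made. Hypothetical names.

class PlanCache:
    def __init__(self):
        self.analyze_count = {}   # table -> number of ANALYZE runs seen
        self.cache = {}           # sql -> (plan, snapshot of analyze_count)

    def analyzed(self, table):
        self.analyze_count[table] = self.analyze_count.get(table, 0) + 1

    def get_plan(self, sql, tables, plan_fn):
        entry = self.cache.get(sql)
        if entry is not None:
            plan, seen = entry
            if all(self.analyze_count.get(t, 0) == seen.get(t, 0)
                   for t in tables):
                return plan       # stats unchanged: reuse the cached plan
        plan = plan_fn(sql)       # (re)plan -- can afford to "try harder"
        self.cache[sql] = (plan, {t: self.analyze_count.get(t, 0)
                                  for t in tables})
        return plan

plans_built = []
cache = PlanCache()
build = lambda sql: plans_built.append(sql) or "plan"

cache.get_plan("SELECT * FROM t", ["t"], build)   # planned
cache.get_plan("SELECT * FROM t", ["t"], build)   # cache hit, no replan
cache.analyzed("t")                               # statistics changed
cache.get_plan("SELECT * FROM t", ["t"], build)   # forced replan
print(len(plans_built))
```

The expensive planning happens only on a miss or after an ANALYZE, which is exactly the trade-off argued above: cache the simple, frequent queries, and spend planner effort only when the cached plan may have gone stale.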
Re: [HACKERS] Lessons from commit fest
On Mon, Apr 14, 2008 at 6:45 PM, Alvaro Herrera [EMAIL PROTECTED] wrote:
> [...] As far as the Wiki page is concerned, it would be good to make sure the entries have a bit more info than just a header line -- things such as author, who reviewed and what did the reviewer say about it. Some of it is already there.
> Something else we learned is that the archives are central (well, we already knew that, but I don't think we had ever given them so broad use), and we've been making changes to them so that they are more useful to reviewers. Further changes are still needed on them, of course, to address the remaining problems.
> Lastly, I would say that pushing submitters to enter their sent patches into the Wiki worked -- we need to ensure that they keep doing it.

I think this should be explained nicely in the developer FAQ. The whole process, preferably. As a first-time contributor ;) I must say I was (and still am, a bit) confused about the process. The FAQ point 1.4 says to discuss it on -hackers unless it's a trivial patch. I thought the patch would be trivial, so I sent it to -patches. Then, later on, I thought that perhaps it should be discussed on -hackers nonetheless, so I have written there also: http://archives.postgresql.org/pgsql-hackers/2008-04/msg00147.php -- then the patch got rejected, if I understand correctly. Now, assuming I want to prepare a patch for something else, at what point does the Wiki come in? Should I send it to -patches and put it on the wiki? Or perhaps wait for some developer's suggestion to put it on the wiki? Should I start a discussion on -hackers, or is -patches enough? I know that with time they look trivial -- but at least I felt quite uncertain about them when sending my first patch. Don't forget to update the developer FAQ as well. :)

Regards, Dawid
Re: [HACKERS] Lessons from commit fest
Dawid Kuroczko wrote:
> I thought the patch would be trivial, sent it to -patches. Then, later on I thought that perhaps it should be discussed on the -hackers nonetheless, so I have written there also: http://archives.postgresql.org/pgsql-hackers/2008-04/msg00147.php then the patch got rejected, if I understand correctly.

The problem is that the patch was initially trivial, but it turned into a much larger redesign of command handling. I think that's a great turnout for a submission.

> Don't forget to update developer FAQ as well. :)

Agreed -- the FAQs and other documents do not cover the processes we're currently following. Mind you, the processes are quite young. (More reason to have them better documented, I guess.)

-- Alvaro Herrera
Re: [HACKERS] Lessons from commit fest
On Mon, 14 Apr 2008 21:25:28 -0400 Alvaro Herrera [EMAIL PROTECTED] wrote:
> The problem is that the patch was initially trivial, but it turned into a much larger redesign of command handling. I think that's a great turnout for a submission.
>> Don't forget to update developer FAQ as well. :)
> Agreed -- the FAQs and other documents do not cover the processes we're currently following. Mind you, the processes are quite young. (More reason to have them better documented I guess.)

We can change the FAQ per commit fest as things grow and, as each commit fest starts, send out the policy for the commit fest on announce.

Joshua D. Drake