Re: [PATCHES] [HACKERS] [Fwd: Index Advisor]
On Tue, 2007-04-10 at 12:18 -0700, Gurjeet Singh wrote:

> Also, although the whole plan-tree is available in
> get_relation_info(), it wouldn't be the right place to scan other
> tables, e.g. for generating JOIN-INDEXes or materializing some
> intermediate joins (sometime in the future we may support them!).

I like Tom's suggestion. We never thought actually creating the indexes was a very good thing and I'd be happy to bury that idea for good. Speed is definitely a consideration if we are to re-plan thousands of SQL statements for a real workload.

> If we don't run the planner twice, then the developer will have to
> run it manually twice, and compare the costs manually (with and
> without v-indexes); virtually impossible for large applications and
> introduction of another human-error possibility.

AFAICS Tom hasn't referred to running twice or not, so I'm not sure what you're referring to, sorry. If you could answer Tom's suggestions one by one, directly underneath them, it would be easier to discuss things.

ISTM that you've done a great job; the trick now is to reach agreement and finish this. If there is something still to discuss, it needs to be very clearly tied back to Tom's comments so everyone can follow it and then agree on it.

If there is a problem in Tom's suggestions that directly affects the operation of the tool, then we need to identify what that is. But if those hooks would give us all we need, then let's agree on it and fix up the adviser plug-in later.

We really, really, really need this. Lots.

-- 
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
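For readers following along, the "run the planner twice and compare the costs" step quoted above can be sketched in a few lines of C. This is only an illustration under stated assumptions: PlanSummary, Candidate and recommend() are hypothetical names, not code from the patch, and a real plan carries far more than a single cost figure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a finished plan; only the total cost is kept. */
typedef struct PlanSummary {
    double total_cost;
} PlanSummary;

/* Hypothetical candidate index considered during the second planner pass. */
typedef struct Candidate {
    const char *table;
    const char *columns;
    int         used_in_plan;   /* nonzero if the second plan uses this index */
} Candidate;

/*
 * Copy into out[] the candidates worth recommending and return how many
 * were copied. Nothing is recommended unless the plan built with virtual
 * indexes is strictly cheaper than the plan built without them.
 */
size_t recommend(const PlanSummary *without, const PlanSummary *with,
                 const Candidate *cands, size_t ncands, Candidate *out)
{
    size_t n = 0;

    assert(without != NULL && with != NULL);
    if (with->total_cost >= without->total_cost)
        return 0;

    for (size_t i = 0; i < ncands; i++)
        if (cands[i].used_in_plan)
            out[n++] = cands[i];
    return n;
}
```

The point of the sketch is that both conditions matter: the second plan has to be cheaper overall, and only the candidates that plan actually used are reported.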
Re: [PATCHES] [HACKERS] [Fwd: Index Advisor]
On 4/12/07, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> Gurjeet Singh wrote:
> > The interface etc. may not be beautiful, but it isn't ugly either! It is
> > a lot better than manually creating pg_index records and inserting them into
> > cache; we use index_create() API to create the index (build is deferred),
> > and then 'rollback to savepoint' to undo those changes when the advisor is
> > done. index_create() causes pg_depends entries too, so a 'RB to SP' is far
> > much safer than going and deleting cache records manually.
>
> My complaint was not that the API used in the code was non-optimal (which
> I think was Tom's issue), but that the _user_ API was not very clean. Not
> sure what to recommend, but I will think about it later.

That can be fixed or improved with minimal effort, but if it is the internal API usage or the architecture that we're bothered about, then IMO an overhaul of the code will not be sufficient; it will require rework from scratch.

Best regards,
-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | yahoo }.com

17°29'34.37"N 78°30'59.76"E - Hyderabad
18°32'57.25"N 73°56'25.42"E - Pune
Re: [PATCHES] [HACKERS] [Fwd: Index Advisor]
Gurjeet Singh wrote:
> The interface etc. may not be beautiful, but it isn't ugly either! It is
> a lot better than manually creating pg_index records and inserting them into
> cache; we use index_create() API to create the index (build is deferred),
> and then 'rollback to savepoint' to undo those changes when the advisor is
> done. index_create() causes pg_depends entries too, so a 'RB to SP' is far
> much safer than going and deleting cache records manually.

My complaint was not that the API used in the code was non-optimal (which I think was Tom's issue), but that the _user_ API was not very clean. Not sure what to recommend, but I will think about it later.

-- 
Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [PATCHES] [HACKERS] [Fwd: Index Advisor]
Hi Tom,

The original patch was submitted by Kai Sattler, and we (at EDB) spent a lot of time improving it, making it as seamless and as user-friendly as possible. As is evident from the version number of the patch (v26), it has gone through a lot of iterations, and was available to the community for review and discussion (and discuss they did; they asked for a few things and those were added or improved).

I am thinking the API needs to be simplified, perhaps by removing the system table and having the recommendations just logged to the server logs.

This means that this very useful information (in log files) would need to be passed through an intermediary, or another tool developed, to allow access to this information. I think that having this available from a table would be very nice.

In the initial submission, the feature was a big piece of code embedded inside the backend. It required a system table, did not show the new plan, actually created the index physically before re-planning, and could not advise for a running application (everything had to be manually EXPLAINed).

I read through the thread titled "Index Tuning Features" that first discussed the idea of an index adviser for PG, and this patch also meets quite a few of the requirements raised there.

Here are a few of the good things about this patch as of now:

.) Loadable plugin. Develop your own plugin to do nifty things with the plan generated by the planner. This works just as the debugger is implemented: if there is no plugin, there is no work to do.

.) No syntax change. Run your queries as they are and get the advice in the advise_index table (or set client_min_messages = LOG to also see the improved plan on the screen, if any).

.) Can recommend indexes even for generated dynamic queries, which are hard to regenerate in a dry run.

.) Can recommend indexes for SQL being executed through plpgsql (or any PL); again, the parameterized queries are hard to regenerate by hand. The advice is available in the advise_index table.

.) The adviser dumps its advice in a table named advise_index. That can be a user table, or a view with an INSERT rule, or anything else; it just has to be an INSERTable object accessible to the executing user (as opposed to the system table required by the original implementation, and hence the need for an initdb).

.) No need to modify the application in any way; just set the PGOPTIONS environment variable properly before executing the application, and run it as usual. The advice is generated for you.

.) No need for the DBA (or the application writer) to feed anything to the planner in any way; the process of recommendation is fully automated (this may change if another plugin implementation requires the stats in some user table).

.) Does recommend multi-column indexes. It does not build a set of every conceivable combination of table columns to develop multi-column indexes (hence avoiding a combinatorial explosion of time and space requirements); it uses the columns used in the query to generate multi-column indexes.

.) The indexes are not created on disk; the index-tuple-size calculation function does a very good job of estimating the size of the virtual index.

.) The changes to the catalog are visible only to the backend running under the adviser; no one else can see those virtual indexes (as opposed to the earlier implementation, where the indexes were created on disk and available to all the backends in the planning phase).

So, with one hook (no GUC variables!), we get all these cool things. I tried very hard to eliminate that one leftover kludge, but couldn't (we have two options here, and they are enclosed in the '#if GLOBAL_CAND_LIST ... #else' parts of the code; it is left up to the committers to decide which one we need!).

Another kludge that I had to add was the SPI_connect() and SPI_finish() frame around the savepoint handling, since RollbackToSavepoint in xact.c assumes that only a PL/* module can be using the savepoint infrastructure (this was discussed on -hackers).

The interface etc. may not be beautiful, but it isn't ugly either! It is a lot better than manually creating pg_index records and inserting them into cache; we use the index_create() API to create the index (the build is deferred), and then 'rollback to savepoint' to undo those changes when the adviser is done. index_create() creates pg_depend entries too, so a 'RB to SP' is far safer than going and deleting cache records manually.

I hope you would agree that we need two passes of the planner, one without v-indexes and the other with v-indexes, for the backend to compare the costs and recommend indexes only if the second plan turned out to be cheaper. If we implement it the way you have suggested, then we will need one hook at the end of get_relation_info(), one in the EXPLAIN code, and yet another, someplace after the planner is finished, to do the comparison of the two plans and recommend only those indexes that were considered useful by the planner (a total of three places to touch).
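To make the first of those three hooks a little more concrete, here is a minimal sketch of a hook called at the end of a get_relation_info()-like routine. It is self-contained and deliberately does not use the real planner structs: IndexInfoSketch, RelInfoSketch, relation_info_hook, get_relation_info_sketch() and add_virtual_index() are all hypothetical names invented for this illustration, and the page count is an arbitrary placeholder.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INDEXES 8

/* Hypothetical, simplified stand-in for the planner's per-index struct. */
typedef struct IndexInfoSketch {
    const char *name;
    double      pages;        /* estimated size; must not be left at zero */
    int         is_virtual;   /* 1 if no catalog entry backs this index */
} IndexInfoSketch;

/* Hypothetical, simplified stand-in for the planner's per-relation struct. */
typedef struct RelInfoSketch {
    const char     *relname;
    IndexInfoSketch indexes[MAX_INDEXES];
    size_t          nindexes;
} RelInfoSketch;

/* Plugin entry point; NULL means no plugin is loaded, so no extra work. */
typedef void (*relation_info_hook_type)(RelInfoSketch *rel);
relation_info_hook_type relation_info_hook = NULL;

/* Stand-in for the core routine: build the index list, then let a plugin edit it. */
void get_relation_info_sketch(RelInfoSketch *rel)
{
    assert(rel != NULL);
    /* (the core would fill rel->indexes from the catalogs at this point) */
    if (relation_info_hook != NULL)
        relation_info_hook(rel);
}

/* Example plugin: append one hypothetical index without touching any catalog. */
void add_virtual_index(RelInfoSketch *rel)
{
    if (rel->nindexes < MAX_INDEXES) {
        IndexInfoSketch *idx = &rel->indexes[rel->nindexes++];

        idx->name = "virtual_idx";
        idx->pages = 10.0;       /* placeholder estimate supplied by the plugin */
        idx->is_virtual = 1;
    }
}
```

With the hook left at NULL the routine behaves exactly as before; assigning add_virtual_index to relation_info_hook is what makes the hypothetical index visible to whatever runs after it.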
Re: [PATCHES] [HACKERS] [Fwd: Index Advisor]
"Gurjeet Singh" <[EMAIL PROTECTED]> writes: > Please find attached the latest version of the patch. It applies cleanly on > REL8_2_STABLE. The interface to the planner in this seems rather brute-force. To run a plan involving a hypothetical index, you have to make a bunch of catalog entries, run the planner, and then roll back the transaction to get rid of the entries. Slow, ugly, and you still need another kluge to keep the planner from believing the index has zero size. It strikes me that there is a better way to do it, because 99% of the planner does not look at the system catalog entries --- all it cares about is the IndexOptInfo structs set up by plancat.c. So there's not really any need to make catalog entries at all AFAICS. Rather, the best thing would be a plugin hook at the end of get_relation_info() that would have a chance to editorialize on the constructed IndexOptInfo list (and maybe other properties of the RelOptInfo too). You could remove existing index entries or insert new ones. I'm dissatisfied with the hard-wired hook into planner() also. That doesn't provide any extensibility nor allow the index adviser to be implemented as a loadable plugin. I'm inclined to think it's in the wrong place anyway; you've got thrashing around there to avoid recursion but it's very fragile. Having to dump the results into the postmaster log isn't a nice user API either. Perhaps what would be better is a hook in EXPLAIN to call a plugin that can add more lines to EXPLAIN's output, and is passed the original query and plan so that it can re-call the planner with hypothetical indexes prepared for insertion by the other hook. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [pgsql-patches] [PATCHES] [HACKERS] [Fwd: Index Advisor]
Your patch has been added to the PostgreSQL unapplied patches list at:

	http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews and approves it.

---

Gurjeet Singh wrote:
> On 1/20/07, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> > I can't read a 7z file on my end. Please email me the file and I will
> > put it at a URL.
> >
> > Gurjeet Singh wrote:
> > > Please find attached the patches ported to HEAD as of now. The patch to
> > > the contrib modules is the same as before; the version number has been
> > > kept but branch designator has been changed.
> > >
> > > 1) pg_post_planner_plugin-HEAD_20070116-v2.patch.gz
> > > 2) pg_index_adviser-HEAD_20070116-v26.7z
>
> I am attaching the .gz versions of both the patches, and CC'ing to -patches
> also. If it doesn't turn up on -patches even this time, then please do the
> needful.
>
> Thanks and best regards,
>
> --
> [EMAIL PROTECTED]
> [EMAIL PROTECTED] gmail | hotmail | yahoo }.com

[ Attachment, skipping... ]

[ Attachment, skipping... ]

-- 
Bruce Momjian [EMAIL PROTECTED]
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [pgsql-patches] [PATCHES] [HACKERS] [Fwd: Index Advisor]
On 1/20/07, Bruce Momjian <[EMAIL PROTECTED]> wrote: I can't read a 7z file on my end. Please email me the file and I will put it at a URL. --- Gurjeet Singh wrote: > Please find attached the patches ported to HEAD as of now. The patch to the > contrib modules is the same as before; the version number has been kept but > branch designator has been changed. > > 1) pg_post_planner_plugin-HEAD_20070116-v2.patch.gz > 2) pg_index_adviser-HEAD_20070116-v26.7z > I am attaching the .gz versions of both the patches, and CC'ing to -patches also. If it doesn't turn up on -patches even this time, then please do the needful. Thanks and best regards, -- [EMAIL PROTECTED] [EMAIL PROTECTED] gmail | hotmail | yahoo }.com pg_post_planner_plugin-HEAD_20070116-v2.patch.gz Description: GNU Zip compressed data pg_index_adviser-HEAD_20070116-v26.patch.gz Description: GNU Zip compressed data ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [pgsql-patches] [PATCHES] [HACKERS] [Fwd: Index Advisor]
I can't read a 7z file on my end. Please email me the file and I will put it at a URL.

---

Gurjeet Singh wrote:
> On 1/15/07, Andrew Dunstan <[EMAIL PROTECTED]> wrote:
> > Gurjeet Singh wrote:
> > > 1) pg_post_planner_plugin-REL8_2_STABLE-v1.patch.gz
> > > 2) pg_index_adviser-REL8_2_STABLE-v26.patch.gz
> >
> > Why are these patches against 8.2 rather than CVS HEAD? Is this not a
> > new feature? We never backport new features to the stable branches -
> > that's what makes them stable ;-)
>
> Please find attached the patches ported to HEAD as of now. The patch to the
> contrib modules is the same as before; the version number has been kept but
> branch designator has been changed.
>
> 1) pg_post_planner_plugin-HEAD_20070116-v2.patch.gz
> 2) pg_index_adviser-HEAD_20070116-v26.7z
>
> Best regards,
>
> --
> [EMAIL PROTECTED]
> [EMAIL PROTECTED] gmail | hotmail | yahoo }.com

[ Attachment, skipping... ]

[ Attachment, skipping... ]

-- 
Bruce Momjian [EMAIL PROTECTED]
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [pgsql-patches] [PATCHES] [HACKERS] [Fwd: Index Advisor]
On 1/15/07, Andrew Dunstan <[EMAIL PROTECTED]> wrote: Gurjeet Singh wrote: > > 1) pg_post_planner_plugin-REL8_2_STABLE-v1.patch.gz > 2) pg_index_adviser-REL8_2_STABLE-v26.patch.gz > Why are these patches against 8.2 rather than CVS HEAD? Is this not a new feature? We never backport new features to the stable branches - that's what makes them stable ;-) Please find attached the patches ported to HEAD as of now. The patch to the contrib modules is the same as before; the version number has been kept but branch designator has been changed. 1) pg_post_planner_plugin-HEAD_20070116-v2.patch.gz 2) pg_index_adviser-HEAD_20070116-v26.7z Best regards, -- [EMAIL PROTECTED] [EMAIL PROTECTED] gmail | hotmail | yahoo }.com pg_post_planner_plugin-HEAD_20070116-v2.patch.gz Description: GNU Zip compressed data pg_index_adviser-HEAD_20070116-v26.7z Description: Binary data ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [pgsql-patches] [PATCHES] [HACKERS] [Fwd: Index Advisor]
> Please find attached two patches: > > 1) pg_post_planner_plugin-REL8_2_STABLE-v1.patch.gz > 2) pg_index_adviser-REL8_2_STABLE-v26.patch.gz > Why are these patches against 8.2 rather than CVS HEAD? Is this not a new feature? We never backport new features to the stable branches - that's what makes them stable ;-) Good point... I always worked only on a tag or a branch, knowing that it is in some stable state, and won't have to be bothered by critical (and sometimes un-compilable) changes to the source tree. These patches *should* apply cleanly to the head too, since the first one touches the code in very stable places and the second one contains only new files. I'll port these patches to the head and get back to the list. Thanks, -- [EMAIL PROTECTED] [EMAIL PROTECTED] gmail | hotmail | yahoo }.com
Re: [pgsql-patches] [PATCHES] [HACKERS] [Fwd: Index Advisor]
Gurjeet Singh wrote: Please find attached two patches: 1) pg_post_planner_plugin-REL8_2_STABLE-v1.patch.gz 2) pg_index_adviser-REL8_2_STABLE-v26.patch.gz Why are these patches against 8.2 rather than CVS HEAD? Is this not a new feature? We never backport new features to the stable branches - that's what makes them stable ;-) cheers andrew ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [pgsql-patches] [PATCHES] [HACKERS] [Fwd: Index Advisor]
Hi Bruce,

I have not been able to send this file across for the last two days. In the past I have been able to send 31KB attachments to -patches, but I don't know why it's not getting through this time. I have tried different levels of compression in different formats, and still it won't let the mail through.

If it doesn't get through to the list even this time, can you please do the needful and get it posted on -patches?

Best regards,
-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | yahoo }.com

This is the second installment of the Index Adviser patch. It contains item number (2) mentioned below.

On 1/13/07, Gurjeet Singh <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> Please find attached two patches:
>
> 1) pg_post_planner_plugin-REL8_2_STABLE-v1.patch.gz
> 2) pg_index_adviser-REL8_2_STABLE-v26.patch.gz

pg_index_adviser-REL8_2_STABLE-v26.7z
Description: Binary data
Re: [pgsql-patches] [PATCHES] [HACKERS] [Fwd: Index Advisor]
It seems the size restriction has blocked my previous attempt. Please find the first patch attached; the second one will be in the next mail.

Best Regards,
-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | yahoo }.com

On 1/13/07, Gurjeet Singh <[EMAIL PROTECTED]> wrote:
> On 1/9/07, Gurjeet Singh <[EMAIL PROTECTED]> wrote:
> > Now that there's just one call to the Index Adviser (from planner()) we
> > can now move forward in making it a plugin.
>
> Hi All,
>
> Please find attached two patches:
>
> 1) pg_post_planner_plugin-REL8_2_STABLE-v1.patch.gz
> 2) pg_index_adviser-REL8_2_STABLE-v26.patch.gz
>
> Patch 1 introduces the infrastructure to call plugins from the tail end of
> the planner() function. The planner looks for a list of PPPFunctions
> (PostPlannerPluginFunctions) in a rendezvous variable, and then calls the
> 'driver' callback into the plugin. This patch also adds a new function in
> explain.c that can be used to generate a string similar to the output of
> the EXPLAIN command. It also adds a harmless DLLIMPORT to some global
> variables that were needed by the Index Adviser plugin.
>
> Patch 2 is the plugin version of the Index Adviser and the advise tool. It
> creates two folders in the contrib module: pg_index_adviser and
> pg_advise_index. The pg_index_adviser folder contains the updated README.
> Both folders contain their respective updated sample_*.[sql|txt] files.
>
> There's one point that needs attention in patch 1. The code enclosed in
> GLOBAL_CAND_LIST is a hack, which I couldn't get rid of. In plancat.c we
> have two options to estimate the number of pages that would be occupied by
> a virtual index:
>
> i) Make a call back into the plugin to get the estimate. The code enabled
>    by GLOBAL_CAND_LIST implements this.
>
> ii) Allow the plugin to update the pg_class.relpages entry for each
>     virtual index, and let the planner pick up the values from there. The
>     code disabled by GLOBAL_CAND_LIST implements this.
>
> Option (ii) would be ideal, but the core members can be a better judge of
> this. Is there any other way of doing this?
>
> Best regards,
> --
> [EMAIL PROTECTED]
> [EMAIL PROTECTED] gmail | hotmail | yahoo }.com

pg_post_planner_plugin-REL8_2_STABLE-v1.patch.gz
Description: GNU Zip compressed data
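The two options for sizing a virtual index can be contrasted in a small self-contained sketch. Everything here is an assumption made for illustration: VirtualIndexSketch, estimate_pages() and virtual_index_pages() are hypothetical names, the 8000-byte usable page size is a placeholder, and the formula is a plain ceiling division rather than the patch's own size-estimation function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a virtual index. */
typedef struct VirtualIndexSketch {
    unsigned long ntuples;        /* expected number of index entries */
    unsigned int  tuple_bytes;    /* estimated bytes per index entry */
    unsigned long stored_pages;   /* analogue of a relpages value set by the plugin */
} VirtualIndexSketch;

typedef unsigned long (*page_estimator_type)(const VirtualIndexSketch *idx);

/* Rough estimate: pages needed for ntuples entries of tuple_bytes each. */
unsigned long estimate_pages(const VirtualIndexSketch *idx)
{
    const unsigned int usable_page_bytes = 8000;   /* illustrative placeholder */
    unsigned long per_page;
    unsigned long pages;

    assert(idx != NULL);
    assert(idx->tuple_bytes > 0 && idx->tuple_bytes <= usable_page_bytes);

    per_page = usable_page_bytes / idx->tuple_bytes;
    pages = (idx->ntuples + per_page - 1) / per_page;   /* ceiling division */
    return pages > 0 ? pages : 1;                       /* never report zero pages */
}

/*
 * Option (i): if a callback is supplied, ask the plugin for the estimate.
 * Option (ii): otherwise trust the page count the plugin stored earlier.
 */
unsigned long virtual_index_pages(const VirtualIndexSketch *idx,
                                  page_estimator_type estimator)
{
    assert(idx != NULL);
    if (estimator != NULL)
        return estimator(idx);
    return idx->stored_pages;
}
```

The trade-off shows up in the signature: option (i) makes the core code aware of a callback, while option (ii) keeps the core unchanged and relies on the plugin having written a sensible value beforehand.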
Re: [PATCHES] [HACKERS] [Fwd: Index Advisor]
On 1/9/07, Gurjeet Singh <[EMAIL PROTECTED]> wrote:
> I have another idea for making the hooks a bit cleaner; I will try that
> and run it through you guys later today.

Please find attached the latest version of the patch. It applies cleanly on REL8_2_STABLE.

The following restrictions are applied before generating an index candidate (this is not mentioned in the README):

.) It should be a base relation; we do not generate advice for a view, a set-returning function, or anything else.

.) It should not be a system table, like pg_class etc.

.) It should not be a temporary table. (Maybe we can allow these; opinions please!)

.) We do not recommend indexes on system columns, like oid, ctid etc.

.) The relation should have at least two pages.

.) The relation should have at least two tuples.

The last two restrictions put the onus on the user to keep the table ANALYZEd or VACUUMed.

I have moved the calls to index_adviser() from two other places to the end of planner(). IMO, that is the best place for a call to the Adviser, instead of duplicating code in every caller of planner(). index_adviser() makes sure that it doesn't get called recursively.

This change, however, costs us the ability to append the suggested plan to the existing plan when called by the EXPLAIN command. Instead, it now uses the newly written function explain_getPlanString() in explain.c to get the string representation of the plan, and then emits it through elog(LOG, ...).

The only kludge left now is the code enclosed in '#if GLOBAL_CAND_LIST' in plancat.c. We need to decide whether we need the 'if' part or the 'else' part! I already see a strong objection to the 'else' option, since it is very close to the core of the optimizer! Opinions needed.

After adding the CREATE TABLE advise_index(...) script to src/test/regress/sql/create_table.sql and enabling the GUC in the conf file, 'make installcheck' runs fine, with a few acceptable diffs. Moreover, after the 'make' run, we can see a few pieces of advice for the tables involved in the test run. Four of them are on serial columns of hash-index-testing tables, so they don't make much sense; but the remaining three are on big tables, and one of them is a multi-column index suggestion.

Now that there's just one call to the Index Adviser (from planner()) we can now move forward in making it a plugin.

Best regards,
-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | yahoo }.com

pg_index_adviser-REL8_2_STABLE-v23.patch.gz
Description: GNU Zip compressed data
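The candidate restrictions listed in this message boil down to a pair of predicates. The sketch below is self-contained and purely illustrative: RelKindSketch, RelSketch, relation_is_candidate() and column_is_candidate() are hypothetical names, not identifiers from the patch, and the column check assumes the convention that user columns have positive attribute numbers while system columns such as oid and ctid do not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of what a range-table entry refers to. */
typedef enum {
    REL_BASE_TABLE,
    REL_VIEW,
    REL_SET_RETURNING_FUNCTION
} RelKindSketch;

/* Hypothetical summary of the facts the restrictions look at. */
typedef struct RelSketch {
    RelKindSketch kind;
    int           is_system_table;   /* e.g. pg_class */
    int           is_temp_table;
    unsigned long pages;             /* as recorded by ANALYZE or VACUUM */
    double        tuples;            /* as recorded by ANALYZE or VACUUM */
} RelSketch;

/* Returns 1 if the relation passes every relation-level restriction. */
int relation_is_candidate(const RelSketch *rel)
{
    assert(rel != NULL);
    return rel->kind == REL_BASE_TABLE
        && !rel->is_system_table
        && !rel->is_temp_table
        && rel->pages >= 2
        && rel->tuples >= 2.0;
}

/* Returns 1 for user columns; assumes system columns have attnum <= 0. */
int column_is_candidate(int attnum)
{
    return attnum > 0;
}
```

Because the page and tuple counts come from stored statistics, a table that has never been ANALYZEd or VACUUMed would fail the last two checks in a sketch like this, which matches the onus on the user noted above.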
Re: [PATCHES] [HACKERS] [Fwd: Index Advisor]
Hi All,

Please find attached the latest version of the patch. It is based on REL8_2_STABLE.

It includes a few bug fixes and an improvement to the size estimation function. It also includes a workaround for the problem we were facing earlier in xact.c; it now passes itself off as a PL/xxx module by wrapping the BeginInternalSubTransaction()/RollbackAndReleaseCurrentSubTransaction() (BIST()/RARCST()) calls in an SPI_connect()/SPI_finish() block.

Please note that the sample_*.txt files in the contrib module, which show a few different sample runs, may be a little out of date.

Best regards,
-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | yahoo }.com

pg_index_adviser-REL8_2_STABLE20061208-v22.patch.gz
Description: GNU Zip compressed data