On Mon, Mar 17, 2014 at 11:50 PM, Merlin Moncure <mmonc...@gmail.com> wrote:
> On Mon, Mar 17, 2014 at 12:57 PM, Atri Sharma <atri.j...@gmail.com> wrote: > > > >> There's a big difference between saying to the planner, "Use plan X" > >> vs "Here's some information describing the data supporting choosing > >> plan X intelligently". The latter allows for better plans in the face > >> of varied/changing data, integrates with the planner in natural way, > >> and encourages users to understand how the planner works. > > > > +1 > > > > I was thinking of varying the 'weight' of a user defined plan by an fixed > > experimental factor to tell the planner to give higher/lower preference > to > > this plan, but after your idea above, I think Stephen's point of > introducing > > a GUC for the factor is the only way possible and I agree with him on the > > point that eventually the user will figure out a way to force usage of > his > > plan using the GUC. > > GUC is not the answer beyond the "broad brush" mostly debugging level > features they already support. What do you do if your plan > simultaneously needs and does not need nestloops? > > A query plan is a complicated thing that is the result of detail > analysis of the data. I bet there are less than 100 users on the > planet with the architectural knowledge of the planner to submit a > 'plan'. What users do have is knowledge of the data that the database > can't effectively gather for some reason. Looking at my query above, > what it would need (assuming the planner could not be made to look > through length()) would be something like: > > SELECT * FROM foo WHERE > length(bar) <= 1000 WITH SELECTIVITY 0.999 > AND length(bar) >= 2 WITH SELECTIVITY 0.999; > > Wont this have scaling issues and issues over time as the data in the table changes? Suppose I make a view with the above query. With time, as the data in the table changes, the selectivity values wont be good for planning. This may potentially lead to a lot of changes in the view definition and other places where this query was used. In general, I think I step back on my point that specifying the selectivity is a bad idea. Could this also work (for the time being) for cross-column statistics? Regards, Atri -- Regards, Atri *l'apprenant*