subject:"Adaptive Plan Sharing for PreparedStmt"

Re: Adaptive Plan Sharing for PreparedStmt

2021-05-25 Thread Andy Fan

Hi,

On Thu, May 20, 2021 at 5:02 PM Tomas Vondra 
wrote:

> Hi,
>
> On 5/20/21 5:43 AM, Andy Fan wrote:
> > Currently we are using a custom/generic strategy to handle the data skew
> > issue. However, it doesn't work well all the time. For example:  SELECT *
> > FROM t WHERE a between $1 and $2. We assume the selectivity is 0.0025,
> > But users may provide a large range every time. Per our current strategy,
> > a generic plan will be chosen, Index scan on A will be chosen. oops..
> >
>
> Yeah, the current logic is rather simple, which is however somewhat on
> purpose, as it makes the planning very cheap. But it also means there's
> very little info to check/compare and so we may make mistakes.
>
> > I think Oracle's Adaptive Cursor sharing should work. First It calculate
> > the selectivity with the real bind values and generate/reuse different
> plan
> > based on the similarity of selectivity. The challenges I can think of
> > now are:
> > a). How to define the similarity.  b). How to adjust the similarity
> > during the
> > real run. for example, we say [1% ~ 10%] is similar. but we find
> > selectivity 20%
> > used the same plan as 10%. what should be done here.
> >
>
> IMO the big question is how expensive this would be. Calculating the
> selectivities for real values (i.e. for each query) is not expensive,
> but it's not free either. So even if we compare the selectivities in
> some way and skip the actual query planning, it's still going to impact
> the prepared statements.
>

That's true if we need to do this every time.  We may just need to do
this on some cases where the estimation is likely to be wrong, like a > $1;
or
a between $1 and $2;  In such cases, we just use the hard coded value
currently.


> Also, we currently don't have any mechanism to extract the selectivities
> from the whole query - not sure how complex that would be, as it may
> involve e.g. join selectivities.
>
> The idea in my mind is just checking the quals on base relations. like
t1.a > $1.
So for something like t1.a + t2.a > $1 will be ignored.


>
> As for how to define the similarity, I doubt there's a simple and
> sensible/reliable way to do that :-(
>
> I remember reading a paper about query planning in which the parameter
> space was divided into regions with the same plan. In this case the
> parameters are selectivities for all the query operations. So what we
> might do is this:
>
> 1) Run the first N queries and extract the selectivities / plans.
>
> 2) Build "clusters" of selecitivies with the same plan.
>
> 3) Before running a query, see if it the selectivities fall into one of
> the existing clusters. If yes, use the plan. If not, do regular
> planning, add it to the data set and repeat (2).
>
> I have no idea how expensive would this be, and I assume the "clusters"
> may have fairly complicated shapes (not simple convex regions).
>
>
Thanks for sharing this,  we do have lots of things to do here.  Your idea
should be a good place to start with.

-- 
Best Regards
Andy Fan (https://www.aliyun.com/)

Re: Adaptive Plan Sharing for PreparedStmt

2021-05-20 Thread Tomas Vondra


Hi,

On 5/20/21 5:43 AM, Andy Fan wrote:

Currently we are using a custom/generic strategy to handle the data skew
issue. However, it doesn't work well all the time. For example:  SELECT *
FROM t WHERE a between $1 and $2. We assume the selectivity is 0.0025,
But users may provide a large range every time. Per our current strategy,
a generic plan will be chosen, Index scan on A will be chosen. oops..



Yeah, the current logic is rather simple, which is however somewhat on 
purpose, as it makes the planning very cheap. But it also means there's 
very little info to check/compare and so we may make mistakes.



I think Oracle's Adaptive Cursor sharing should work. First It calculate
the selectivity with the real bind values and generate/reuse different plan
based on the similarity of selectivity. The challenges I can think of 
now are:
a). How to define the similarity.  b). How to adjust the similarity 
during the
real run. for example, we say [1% ~ 10%] is similar. but we find 
selectivity 20%

used the same plan as 10%. what should be done here.



IMO the big question is how expensive this would be. Calculating the 
selectivities for real values (i.e. for each query) is not expensive, 
but it's not free either. So even if we compare the selectivities in 
some way and skip the actual query planning, it's still going to impact 
the prepared statements.


Also, we currently don't have any mechanism to extract the selectivities 
from the whole query - not sure how complex that would be, as it may 
involve e.g. join selectivities.



As for how to define the similarity, I doubt there's a simple and 
sensible/reliable way to do that :-(


I remember reading a paper about query planning in which the parameter 
space was divided into regions with the same plan. In this case the 
parameters are selectivities for all the query operations. So what we 
might do is this:


1) Run the first N queries and extract the selectivities / plans.

2) Build "clusters" of selecitivies with the same plan.

3) Before running a query, see if it the selectivities fall into one of 
the existing clusters. If yes, use the plan. If not, do regular 
planning, add it to the data set and repeat (2).


I have no idea how expensive would this be, and I assume the "clusters" 
may have fairly complicated shapes (not simple convex regions).



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Adaptive Plan Sharing for PreparedStmt

2021-05-19 Thread Andy Fan

Currently we are using a custom/generic strategy to handle the data skew
issue. However, it doesn't work well all the time. For example:  SELECT *
FROM t WHERE a between $1 and $2. We assume the selectivity is 0.0025,
But users may provide a large range every time. Per our current strategy,
a generic plan will be chosen, Index scan on A will be chosen. oops..

I think Oracle's Adaptive Cursor sharing should work. First It calculate
the selectivity with the real bind values and generate/reuse different plan
based on the similarity of selectivity. The challenges I can think of now
are:
a). How to define the similarity.  b). How to adjust the similarity during
the
real run. for example, we say [1% ~ 10%] is similar. but we find
selectivity 20%
used the same plan as 10%. what should be done here.


I am searching for the best place to invest in the optimizer aspect. and
the above idea should be the one I can think of now.  Any thought?


Thanks
-- 
Best Regards
Andy Fan (https://www.aliyun.com/)

Re: Adaptive Plan Sharing for PreparedStmt

Re: Adaptive Plan Sharing for PreparedStmt

Adaptive Plan Sharing for PreparedStmt

3 matches

Site Navigation

Mail list logo

Footer information