Emre Hasegeli writes:
> [ inet-selfuncs-v14.patch ]
After further reflection I concluded that the best way to deal with the
O(N^2) runtime problem for the join selectivity function was to set a
limit on the number of statistics values we'd consider, as was discussed
awhile back IIRC. We can easi
New version of the patch attached with the optimization to break the
loop before looking at all of the histogram values. I can reduce
join selectivity estimation runtime by reducing the values of the
left hand side or both of the sides, if there is interest.
> > Even if the above aspects of the c
On Wed, Dec 3, 2014 at 6:14 AM, Emre Hasegeli wrote:
>> Actually, there's a second large problem with this patch: blindly
>> iterating through all combinations of MCV and histogram entries makes the
>> runtime O(N^2) in the statistics target. I made up some test data (by
>> scanning my mail logs)
> Actually, there's a second large problem with this patch: blindly
> iterating through all combinations of MCV and histogram entries makes the
> runtime O(N^2) in the statistics target. I made up some test data (by
> scanning my mail logs) and observed the following planning times, as
> reported
> I spent a fair chunk of the weekend hacking on this patch to make
> it more understandable and fix up a lot of what seemed to me pretty
> clear arithmetic errors in the "upper layers" of the patch. However,
> I couldn't quite convince myself to commit it, because the business
> around estimation
I wrote:
> I spent a fair chunk of the weekend hacking on this patch to make
> it more understandable and fix up a lot of what seemed to me pretty
> clear arithmetic errors in the "upper layers" of the patch. However,
> I couldn't quite convince myself to commit it, because the business
> around e
Emre Hasegeli writes:
>> Thanks. Overall, my impression of this patch is that it works very
>> well. But damned if I understood *how* it works :-). There's a lot
>> of statistics involved, and it's not easy to see why something is
>> multiplied by something else. I'm adding comments as I read thro
> Thanks. Overall, my impression of this patch is that it works very
> well. But damned if I understood *how* it works :-). There's a lot
> of statistics involved, and it's not easy to see why something is
> multiplied by something else. I'm adding comments as I read through
> it.
Thank you for lo
On 09/07/2014 07:09 PM, Emre Hasegeli wrote:
I updated the patch to cover semi and anti joins with eqjoinsel_semi().
I think it is better than returning a constant.
What you did there is utterly unacceptable from a modularity standpoint;
and considering that the values will be nowhere near righ
>
> New version with semi join estimation function attached.
>
I have performed the following initial review:
- Patch format. -- submitted as unified, but not sure it makes it any
easier to read than context format.
- Apply to current master (77e65bf). -- success (though, I do get
"Stripping tra
> > > I updated the patch to cover semi and anti joins with eqjoinsel_semi().
> > > I think it is better than returning a constant.
> >
> > What you did there is utterly unacceptable from a modularity standpoint;
> > and considering that the values will be nowhere near right, the argument
> > that
> What you did there is utterly unacceptable from a modularity standpoint;
> and considering that the values will be nowhere near right, the argument
> that "it's better than returning a constant" seems pretty weak. I think
> you should just take that out again.
I will try to come up with a bette
> Heikki Linnakangas writes:
> > * inet_mcv_join_selec() is O(n^2) where n is the number of entries in
> > the MCV lists. With the max statistics target of 1, a worst case
> > query on my laptop took about 15 seconds to plan. Maybe that's
> > acceptable, but you went through some trouble to
> * Isn't "X >> Y" equivalent to "network_scan_first(X) < Y AND
> network_scan_last(X) > Y"? Or at least close enough for selectivity
> estimation purposes? Pardon my ignorance - I'm not too familiar with the
> inet datatype - but how about just calling scalarineqsel for both bounds?
Actually,
Heikki Linnakangas writes:
> * inet_mcv_join_selec() is O(n^2) where n is the number of entries in
> the MCV lists. With the max statistics target of 1, a worst case
> query on my laptop took about 15 seconds to plan. Maybe that's
> acceptable, but you went through some trouble to make plan
Emre Hasegeli writes:
> I updated the patch to cover semi and anti joins with eqjoinsel_semi().
> I think it is better than returning a constant.
What you did there is utterly unacceptable from a modularity standpoint;
and considering that the values will be nowhere near right, the argument
that
On 08/26/2014 12:44 PM, Emre Hasegeli wrote:
I agree with you that we can support other join type and anti join later,
If others don’t have any objection in doing other parts later I will mark as "Ready
For Committer".
I updated the patch to cover semi and anti joins with eqjoinsel_semi().
I t
> I agree with you that we can support other join type and anti join later,
> If others don’t have any objection in doing other parts later I will mark as
> "Ready For Committer".
I updated the patch to cover semi and anti joins with eqjoinsel_semi().
I think it is better than returning a constan
On 12 July 2014 23:25, Emre Hasegeli Wrote,
> > I have one last comment, after clarifying this I can move it to
> "ready for committer".
> > 1. In networkjoinsel, For avoiding the case of huge statistics, only
> some of the values from mcv and histograms are used (calculated using
> SQRT).
> > --
> I have one last comment, after clarifying this I can move it to "ready for
> committer".
> 1. In networkjoinsel, For avoiding the case of huge statistics, only some of
> the values from mcv and histograms are used (calculated using SQRT).
> -- But in my opinion, if histograms and mcv both are e
On 06 July 2014 20:33, Emre Hasegeli Wrote,
>
> > Apart from that there is one spell check you can correct
> > -- in inet_his_inclusion_selec comments histogram boundies ->
> > histogram boundaries :)
>
> I fixed it. New version attached. The debug log statements are also
> removed.
I have do
Thank you for looking at it.
> In inet_his_inclusion_selec function,
> When the constant matches only the right side of the bucket, and if it’s a
> last bucket then it's never considered as partial match candidate.
> In my opinion, if it's not a last bucket then for next bucket it will become
>
On, 15 May 2014 14:04 Emre Hasegeli Wrote,
>
> * matching first MCV to second MCV
> * searching first MCV in the second histogram
> * searching second MCV in the first histogram
> * searching boundaries of the first histogram in the second histogram
>
> Comparing the lists with each other slows
I wanted to check the patch last time and found a bug effecting
MVC vs MVC part of the join selectivity. Fixed version attached.
Emre Hasegeli :
> Comparing the lists with each other slows down the function when
> statistics set to higher values. To avoid this problem I only use
> log(n) values of
New version of the selectivity estimation patch attached. I am adding
it to CommitFest 2014-06. Previous version of it reviewed by
Andreas Karlson on the previous CommitFest with the GiST support patch.
The new version includes join selectivity estimation.
Join selectivity is calculated in 4 steps
25 matches
Mail list logo