Dean Rasheed writes:
> Here's an updated patch with references to both papers, and a more
> detailed justification for the formula, along with the other changes
> discussed. Note that although equation (2) in the Dell'Era paper looks
> different from the Yao formula, it's actually the same.
Looks
On 31 March 2016 at 22:02, Tom Lane wrote:
> I'm just concerned about what happens when the Dellera paper stops being
> available. I don't mind including that URL as a backup to the written-out
> argument I just suggested.
>
Here's an updated patch with references to both papers, and a more
deta
Alvaro Herrera writes:
> Tom Lane wrote:
>> See "Approximating block accesses in database organizations", S. B. Yao,
>> Communications of the ACM, Volume 20 Issue 4, April 1977 Pages 260-261
> That sounds nice all right, but I'm not sure it's actually helpful,
> because the article text is not av
Dean Rasheed writes:
> Yeah, that makes sense. In fact, if we only apply the adjustment when
> reldistinct > 0 and rel->rows < rel->tuples, and rewrite the first
> argument to pow() as (rel->tuples - rel->rows) / rel->tuples, then it
> is guaranteed to be non-negative. If rel->rows >= rel->tuples
Dean Rasheed writes:
> On 31 March 2016 at 21:40, Tom Lane wrote:
>> Let's use something like this:
>> See "Approximating block accesses in database organizations", S. B. Yao,
>> Communications of the ACM, Volume 20 Issue 4, April 1977 Pages 260-261
Actually, having now looked at both the Deller
Tom Lane wrote:
> > The article text refers to this 1977 S. B. Yao paper "Approximating
> > block accesses in database organizations" which doesn't appear to be
> > available online, except behind ACM's paywall at
> > http://dl.acm.org/citation.cfm?id=359475
>
> Well, a CACM citation is perfectl
On 31 March 2016 at 21:40, Tom Lane wrote:
> Alvaro Herrera writes:
>> Tom Lane wrote:
>>> Another minor gripe is the use of a random URL as justification. This
>>> code will still be around when that URL exists nowhere but the Wayback
>>> Machine. Can't we find a more formal citation to use?
>
On 31 March 2016 at 20:18, Tom Lane wrote:
> Also, I wonder if it'd be a good idea to provide a guard against division
> by zero --- we know rel->tuples > 0 at this point, but I'm less sure that
> reldistinct can't be zero. In the same vein, I'm worried about the first
> argument of pow() being s
Alvaro Herrera writes:
> Tom Lane wrote:
>> Another minor gripe is the use of a random URL as justification. This
>> code will still be around when that URL exists nowhere but the Wayback
>> Machine. Can't we find a more formal citation to use?
> The article text refers to this 1977 S. B. Yao p
Tom Lane wrote:
> Another minor gripe is the use of a random URL as justification. This
> code will still be around when that URL exists nowhere but the Wayback
> Machine. Can't we find a more formal citation to use?
The article text refers to this 1977 S. B. Yao paper "Approximating
block acce
Dean Rasheed writes:
> On 30 March 2016 at 14:03, Tomas Vondra wrote:
>> Attached is v4 of the patch
> Thanks, I think this is good to go, except that I think we need to use
> pow() rather than powl() because AIUI powl() is new to C99, and so
> won't necessarily be available on all supported pla
On 30 March 2016 at 14:03, Tomas Vondra wrote:
> Attached is v4 of the patch
Thanks, I think this is good to go, except that I think we need to use
pow() rather than powl() because AIUI powl() is new to C99, and so
won't necessarily be available on all supported platforms. I don't
think we need w
Hi,
On 03/22/2016 03:40 PM, Alexander Korotkov wrote:
I think you should send a revision of patch including comments proposed
by Deam Rasheed.
I'm switching patch status to waiting on author in commitfest.
Attached is v4 of the patch - the only difference w.r.t. v3 is that I've
used the com
Hi Thomas,
On 3/22/16 10:40 AM, Alexander Korotkov wrote:
I think you should send a revision of patch including comments proposed
by Deam Rasheed.
I'm switching patch status to waiting on author in commitfest.
We're getting pretty close to the end of the CF. Do you know when you
will have a
Hi, Tomas!
On Mon, Mar 21, 2016 at 2:39 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:
> On Fri, Mar 18, 2016 at 1:20 PM, Dean Rasheed
> wrote:
>
>> Probably a better URL to give is
>> http://www.adellera.it/investigations/distinct_balls/ which has a link
>> to the PDF version of the
Hi, Dean!
On Fri, Mar 18, 2016 at 1:20 PM, Dean Rasheed
wrote:
> Probably a better URL to give is
> http://www.adellera.it/investigations/distinct_balls/ which has a link
> to the PDF version of the paper and also some supporting material.
>
> However, while that paper is in general very clear,
On 18 March 2016 at 00:37, Tomas Vondra wrote:
>> On Sun, 2016-03-13 at 15:24 +, Dean Rasheed wrote:
>>> I think that a better formula to use would be
>>>
>>> reldistinct *= (1 - powl(1 - rel-rows / rel->tuples, rel->tuples /
>>> reldistinct)
>
> Attached is a v3 of the patch using this formul
On 03/13/2016 11:09 PM, Tomas Vondra wrote:
Hi,
On Sun, 2016-03-13 at 15:24 +, Dean Rasheed wrote:
On 4 March 2016 at 13:10, Tomas Vondra
wrote:
...
>>>
I think that a better formula to use would be
reldistinct *= (1 - powl(1 - rel-rows / rel->tuples, rel->tuples /
reldistinct)
Att
Hi,
On Sun, 2016-03-13 at 15:24 +, Dean Rasheed wrote:
> On 4 March 2016 at 13:10, Tomas Vondra
> wrote:
> >
> > The risk associated with over-estimation is that we may switch from
> > HashAggregate to GroupAggregate, and generally select paths better
> > suited for large number of rows. Not
On 4 March 2016 at 13:10, Tomas Vondra wrote:
> The risk associated with over-estimation is that we may switch from
> HashAggregate to GroupAggregate, and generally select paths better
> suited for large number of rows. Not great, because the plan may not be
> optimal and thus run much slower - bu
On Thu, 2016-03-03 at 11:42 -0800, Mark Dilger wrote:
> > On Mar 3, 2016, at 11:27 AM, Alexander Korotkov
> > wrote:
> >
> > On Thu, Mar 3, 2016 at 10:16 PM, Tomas Vondra
> > wrote:
> > So yes, each estimator works great for exactly the opposite cases. But
> > notice that typically, the resul
> On Mar 3, 2016, at 11:27 AM, Alexander Korotkov
> wrote:
>
> On Thu, Mar 3, 2016 at 10:16 PM, Tomas Vondra
> wrote:
> So yes, each estimator works great for exactly the opposite cases. But notice
> that typically, the results of the new formula is much higher than the old
> one, sometimes
On Thu, Mar 3, 2016 at 10:16 PM, Tomas Vondra
wrote:
> So yes, each estimator works great for exactly the opposite cases. But
> notice that typically, the results of the new formula is much higher than
> the old one, sometimes by two orders of magnitude (and it shouldn't be
> difficult to constru
Hi,
On 03/03/2016 06:40 PM, Mark Dilger wrote:
On Mar 3, 2016, at 8:35 AM, Tomas Vondra wrote:
Hi,
On 03/03/2016 12:53 PM, Alexander Korotkov wrote:
Hi, Tomas!
I've assigned to review this patch.
I've checked version estimate-num-groups-v2.txt by Mark Dilger.
It applies to head cleanly,
On Thu, Mar 3, 2016 at 7:35 PM, Tomas Vondra
wrote:
> On 03/03/2016 12:53 PM, Alexander Korotkov wrote:
>
>> I've assigned to review this patch.
>>
>> I've checked version estimate-num-groups-v2.txt by Mark Dilger.
>> It applies to head cleanly, passes corrected regression tests.
>>
>> About corr
> On Mar 3, 2016, at 8:35 AM, Tomas Vondra wrote:
>
> Hi,
>
> On 03/03/2016 12:53 PM, Alexander Korotkov wrote:
>> Hi, Tomas!
>>
>> I've assigned to review this patch.
>>
>> I've checked version estimate-num-groups-v2.txt by Mark Dilger.
>> It applies to head cleanly, passes corrected regress
Hi,
On 03/03/2016 12:53 PM, Alexander Korotkov wrote:
Hi, Tomas!
I've assigned to review this patch.
I've checked version estimate-num-groups-v2.txt by Mark Dilger.
It applies to head cleanly, passes corrected regression tests.
About correlated/uncorrelated cases. I think now optimizer mostly
Hi, Tomas!
I've assigned to review this patch.
I've checked version estimate-num-groups-v2.txt by Mark Dilger.
It applies to head cleanly, passes corrected regression tests.
About correlated/uncorrelated cases. I think now optimizer mostly assumes
all distributions to be independent.
I think we
Hi,
On 02/26/2016 04:32 AM, Mark Dilger wrote:
On Feb 25, 2016, at 4:59 PM, Tomas Vondra wrote:
...
The culprit here is that the two columns are not independent, but
are (rather strongly) correlated due to the way you've generated
the data.
Yes, that was intentional. Your formula is exa
> On Feb 25, 2016, at 4:59 PM, Tomas Vondra
> wrote:
>
> Hi,
>
> On 02/26/2016 12:16 AM, Mark Dilger wrote:
>>
>> It is not hard to write test cases where your patched version overestimates
>> the number of rows by a very similar factor as the old code underestimates
>> them. My very first t
Hi,
On 02/26/2016 12:16 AM, Mark Dilger wrote:
It is not hard to write test cases where your patched version overestimates
the number of rows by a very similar factor as the old code underestimates
them. My very first test, which was not specifically designed to demonstrate
this, happens to be
> On Feb 25, 2016, at 3:16 PM, Mark Dilger wrote:
>
>
>> On Feb 23, 2016, at 5:12 PM, Tomas Vondra
>> wrote:
>>
>>
>>
>> So much better. Clearly, there are cases where this will over-estimate the
>> cardinality - for example when the values are somehow correlated.
>>
>
> I applied your
> On Feb 23, 2016, at 5:12 PM, Tomas Vondra
> wrote:
>
>
>
> So much better. Clearly, there are cases where this will over-estimate the
> cardinality - for example when the values are somehow correlated.
>
I applied your patch, which caused a few regression tests to fail. Attached
is a pa
33 matches
Mail list logo