On Thu, Mar 19, 2009 at 4:04 AM, ITAGAKI Takahiro wrote:
> Robert Haas wrote:
>> >> Works for me. Especially if you want to think more about ANALYZE before
>> >> changing that.
>> >
>> > Well, it's something that would be sane to contemplate adding in 8.4.
>> > It's way too late for any of this other stuff to happen in this release.
Robert Haas wrote:
> >> Works for me. Especially if you want to think more about ANALYZE before
> >> changing that.
> >
> > Well, it's something that would be sane to contemplate adding in 8.4.
> > It's way too late for any of this other stuff to happen in this release.
>
> I'm thinking about t
On Fri, Dec 12, 2008 at 3:59 PM, Tom Lane wrote:
> Simon Riggs writes:
>> On Fri, 2008-12-12 at 13:20 -0500, Tom Lane wrote:
>>> Simon Riggs writes:
>>>> Solutions can also include
>>>> * manual mechanism for setting ndistinct that doesn't keep getting
>>>> overwritten by subsequent ANALYZEs
>>>
On Thu, 11 Dec 2008, Tom Lane wrote:
On the whole I think we have some evidence here to say that upping the
default value of default_stats_target to 100 wouldn't be out of line,
but 1000 definitely is. Comments?
Circling back to where this started from now that the discussion has
slowed down
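For reference, the statistics target being weighed here can be raised either
globally or per column; the table and column names below are illustrative only:

    -- session- or cluster-wide default
    SET default_statistics_target = 100;

    -- per-column override, picked up by the next ANALYZE
    ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 1000;
    ANALYZE orders;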
Simon Riggs writes:
> On Fri, 2008-12-12 at 13:20 -0500, Tom Lane wrote:
>> Simon Riggs writes:
>>> Solutions can also include
>>> * manual mechanism for setting ndistinct that doesn't keep getting
>>> overwritten by subsequent ANALYZEs
>>
>> Hmm, that might actually be the most practical answer for large,
>> reasonably-static tables.
On Fri, 2008-12-12 at 13:20 -0500, Tom Lane wrote:
> Simon Riggs writes:
> > Solutions can also include
> > * manual mechanism for setting ndistinct that doesn't keep getting
> > overwritten by subsequent ANALYZEs
>
> Hmm, that might actually be the most practical answer for large,
> reasonably-static tables.
Gregory Stark wrote:
Simon Riggs writes:
The amount of I/O could stay the same, just sample all rows on block. [...]
It will also introduce strange biases. For instance in a clustered table it'll
think there are a lot more duplicates than there really are because it'll see
lots of similar va
On Fri, 2008-12-12 at 13:43 -0500, Tom Lane wrote:
> Simon Riggs writes:
> > On Fri, 2008-12-12 at 13:18 -0500, Tom Lane wrote:
> >> Could we skip the hyperbole please?
>
> > Some of the ndistinct values are very badly off, and in the common cases
> > I cited previously, consistently so.
>
> >
On Fri, Dec 12, 2008 at 6:31 PM, Simon Riggs wrote:
>
> Why not keep the random algorithm we have now, but scan the block into a
> separate hash table for ndistinct estimation. That way we keep the
> correct random rows for other purposes.
It seems to me that what you have to do is look at a set
On Fri, Dec 12, 2008 at 6:18 PM, Tom Lane wrote:
> I seem to recall Greg suggesting that there were ways to estimate
> ndistinct without sorting, but short of a fundamental algorithm change
> there's not going to be a win here.
Not sure if you meant me, but I can say the approach I saw documented
>> Perhaps a table-level option to scan the whole table instead of
>> estimating would be appropriate?
>>
> ANALYZE FULL foo ?
I was thinking more like a flag that you could set on the table
itself, that would apply to all future analyzes.
...Robert
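As a sketch of how such a per-table flag might be spelled as a storage
parameter; the analyze_full parameter below is hypothetical and does not exist:

    -- hypothetical flag: ask every future ANALYZE to scan all rows
    -- instead of sampling
    ALTER TABLE foo SET (analyze_full = true);
    ANALYZE foo;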
Robert Haas escreveu:
>> Which raises the issue, if we could get better statistics by passing
>> the whole table, why not do that when VACUUM ANALYZE is run?
>
> I think the reason is "because the next autovacuum would undo it".
>
Ok, but even if autovacuum will undo it, almost-static-dataset wou
Simon Riggs writes:
> On Fri, 2008-12-12 at 13:18 -0500, Tom Lane wrote:
>> Could we skip the hyperbole please?
> Some of the ndistinct values are very badly off, and in the common cases
> I cited previously, consistently so.
> Once I'm certain the rescue helicopter has seen me, I'll stop waving
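One way to see how far off a stored ndistinct value is (table and column names
illustrative) is to compare what ANALYZE recorded against an exact count:

    -- estimate from the last ANALYZE; negative values mean a fraction of reltuples
    SELECT n_distinct FROM pg_stats
     WHERE tablename = 'orders' AND attname = 'customer_id';

    -- exact figure, at the cost of a full scan
    SELECT count(DISTINCT customer_id) FROM orders;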
On Fri, 2008-12-12 at 13:18 -0500, Tom Lane wrote:
> I seem to recall Greg suggesting that there were ways to estimate
> ndistinct without sorting, but short of a fundamental algorithm change
> there's not going to be a win here.
Hash table? Haas & Stokes suggest a Bloom filter.
Why not keep the
On Fri, 2008-12-12 at 13:10 -0500, Tom Lane wrote:
> If we don't
> have at least ballpark-correct figures for cost and number of output
> rows, we'll start making mistakes at higher plan levels.
That's definitely where we are now.
Had a call with a customer today where I said "it's OK, its onl
Simon Riggs writes:
> Solutions can also include
> * manual mechanism for setting ndistinct that doesn't keep getting
> overwritten by subsequent ANALYZEs
Hmm, that might actually be the most practical answer for large,
reasonably-static tables. Especially if we expose the "negative
stadistinct"
Simon Riggs writes:
> As I said, we would only increase sample for ndistinct, not for others.
How will you do that? Keep in mind that one of the things we have to do
to compute ndistinct is to sort the sample. ISTM that the majority of
the cost of a larger sample is going to get expended anyway
On Fri, 2008-12-12 at 18:01 +0000, Greg Stark wrote:
> I think you need to find two different formulas, one which represents
> a clustered table and one which represents randomly distributed data.
> Then you need a way to measure just how clustered the data is so you
> know how much weight to giv
Alvaro Herrera wrote:
> Tom Lane escribió:
>
> > If you want ANALYZE to be cheap then you simply don't get to have a
> > trustworthy value of ndistinct.
>
> But then, maybe it's not all that critical that ANALYZE is cheap. For
> example, if we were to rework VACUUM ANALYZE so that on the same pa
Alvaro Herrera wrote:
> Robert Haas escribió:
> > > Which raises the issue, if we could get better statistics by passing
> > > the whole table, why not do that when VACUUM ANALYZE is run?
> >
> > I think the reason is "because the next autovacuum would undo it".
>
> Is there any way to "merge" th
Alvaro Herrera writes:
> Is there any way to "merge" the statistics? i.e. if a full table scan
> is done to compute precise statistics, and later a regular analyze scan
> is done, then perhaps instead of clobbering the previous stats, you
> merge them with the new ones, thus not completely losing
Simon Riggs writes:
> On Fri, 2008-12-12 at 11:16 -0500, Tom Lane wrote:
>> Perhaps a better plan is to try to de-emphasize use of ndistinct,
>> though I concede I have no idea how to do that.
> We don't actually care about the accuracy of the ndistinct much, just
> the accuracy of our answer to
On Fri, 2008-12-12 at 14:03 -0300, Alvaro Herrera wrote:
> Partial vacuum is a roadblock for this though :-(
Actually, perhaps it's an enabler for looking at changed stats?
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
On Fri, Dec 12, 2008 at 5:33 PM, Simon Riggs wrote:
>> Incidentally we *do* do block sampling. We pick random blocks and then pick
>> random records within those blocks. This was new in, uh, 7.4? 8.0? Sometime
>> around then. It dramatically reduced the i/o requirements but there were long
>> disc
Alvaro Herrera writes:
> Tom Lane escribió:
>> If you want ANALYZE to be cheap then you simply don't get to have a
>> trustworthy value of ndistinct.
> But then, maybe it's not all that critical that ANALYZE is cheap. For
> example, if we were to rework VACUUM ANALYZE so that on the same pass
>
"Robert Haas" writes:
> On Fri, Dec 12, 2008 at 4:04 AM, Simon Riggs wrote:
>>> The existing sampling mechanism is tied to solid statistics.
>>
>> Sounds great, but its not true. The sample size is not linked to data
>> volume, so how can it possibly give a consistent confidence range?
> It is
On Fri, 2008-12-12 at 17:05 +0000, Gregory Stark wrote:
> Simon Riggs writes:
>
> > The amount of I/O could stay the same, just sample all rows on block.
> > Lifting the sample size will help large tables. Will it be perfect? No.
> > But I'll take "better" over "not working at all".
>
> That wi
>>> "Robert Haas" wrote:
>> Which raises the issue, if we could get better statistics by passing
>> the whole table, why not do that when VACUUM ANALYZE is run?
>
> I think the reason is "because the next autovacuum would undo it".
The table has 32.4 million rows.
autovacuum_analyze_scale_fac
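With the default autovacuum settings (autovacuum_analyze_threshold = 50,
autovacuum_analyze_scale_factor = 0.1), a table is re-analyzed only after
roughly 50 + 0.1 * 32,400,000 ≈ 3.24 million rows have changed, so on an
almost-static data set a manual whole-table ANALYZE would survive for a long
time before autovacuum overwrote it.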
On Fri, 2008-12-12 at 16:10 +0000, Gregory Stark wrote:
> Uhm, this is a survey of lots of different methods and does lots of
> analysis.
> I don't see any simple conclusions about stability. Perhaps I'm just missing
> it in the technical details. Could you point out exactly what part of the
>
Simon Riggs writes:
> The amount of I/O could stay the same, just sample all rows on block.
> Lifting the sample size will help large tables. Will it be perfect? No.
> But I'll take "better" over "not working at all".
That will just raise the table size at which the problems start. It'll still
b
Tom Lane escribió:
> If you want ANALYZE to be cheap then you simply don't get to have a
> trustworthy value of ndistinct.
But then, maybe it's not all that critical that ANALYZE is cheap. For
example, if we were to rework VACUUM ANALYZE so that on the same pass
that VACUUM cleans each heap page
Robert Haas escribió:
> > Which raises the issue, if we could get better statistics by passing
> > the whole table, why not do that when VACUUM ANALYZE is run?
>
> I think the reason is "because the next autovacuum would undo it".
Is there any way to "merge" the statistics? i.e. if a full table
On Fri, 2008-12-12 at 11:16 -0500, Tom Lane wrote:
> Perhaps a better plan is to try to de-emphasize use of ndistinct,
> though I concede I have no idea how to do that.
We don't actually care about the accuracy of the ndistinct much, just
the accuracy of our answer to the question "given work_me
On Fri, 2008-12-12 at 11:16 -0500, Tom Lane wrote:
> If you want ANALYZE to be cheap then you simply don't get to have
> a trustworthy value of ndistinct.
Understood, but a cheap ANALYZE isn't always a higher priority than all
other considerations.
Solutions can also include
* allowing user t
On Fri, 2008-12-12 at 16:10 +0000, Gregory Stark wrote:
> Right, but increasing our sample size by a factor of 150 for a 100M
> row table doesn't seem like a reasonable solution to one metric being
> bogus.
>
> For that matter, if we do consider sampling 5% of the table we may as
> well just go a
> Which raises the issue, if we could get better statistics by passing
> the whole table, why not do that when VACUUM ANALYZE is run?
I think the reason is "because the next autovacuum would undo it".
Perhaps a table-level option to scan the whole table instead of
estimating would be appropriate?
>>> "Nathan Boley" wrote:
> Can anyone suggest a good data set to test this sort of question on?
Where we have the biggest problem with bad estimates is on complex
searches involving many joins where the main criterion limiting the
result set is a name. The estimate based on the histogram is o
Gregory Stark writes:
> For that matter, if we do consider sampling 5% of the table we may as well
> just go ahead and scan the whole table. It wouldn't take much longer and it
> would actually produce good estimates.
Yeah. Anything over a small fraction of a percent is going to imply
fetching e
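To make that concrete: assuming roughly 100 rows per block and independently
sampled rows (a rough approximation), a 1% row sample leaves a block untouched
with probability 0.99^100 ≈ 0.37, i.e. it already reads about 63% of the
blocks; at a 5% sample that probability drops to 0.95^100 ≈ 0.006, so
essentially every block is fetched.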
Simon Riggs writes:
> On Fri, 2008-12-12 at 06:44 -0500, Robert Haas wrote:
>> It is a pretty well-known mathematical fact that for something like an
>> opinion poll your margin of error does not depend on the size of the
>> population but only on the size of your sample.
>
> Yes, I agree with th
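The polling result being referred to: for a proportion estimated from a simple
random sample of n rows, the 95% margin of error is about
1.96 * sqrt(p*(1-p)/n), which is at most 1.96 * sqrt(0.25/n) and so depends
only on the sample size, not the table size. With the 30,000-row sample
implied by a statistics target of 100 (ANALYZE samples 300 rows per unit of
target), that worst case is roughly 0.57%.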
"Greg Stark" writes:
> On Fri, Dec 12, 2008 at 2:35 PM, Tom Lane wrote:
>> But having said that, I have wondered whether we should consider
>> allowing the sample to grow to fill maintenance_work_mem
> Hm, so I wonder what this does to the time analyze takes. I think it
> would be the only thing
On Fri, 2008-12-12 at 09:35 -0500, Tom Lane wrote:
> AFAICS, marginal enlargements in the sample size aren't going to help
> much for ndistinct --- you really need to look at most or all of the
> table to be guaranteed anything about that.
>
> But having said that, I have wondered whether we sho
On Fri, 2008-12-12 at 06:44 -0500, Robert Haas wrote:
> On Fri, Dec 12, 2008 at 4:04 AM, Simon Riggs wrote:
> >> The existing sampling mechanism is tied to solid statistics. It
> >> provides the correct sample size to get a consistent confidence range
> >> for range queries. This is the same math
"Robert Haas" writes:
> On Thu, Dec 11, 2008 at 10:12 PM, Tom Lane wrote:
>> Maybe so. If we stick to the other design (end both lists at a preset
>> frequency threshold) then the math clearly goes through the same as
>> before, just with num_mcvs that are determined differently. But can
>> we
On Fri, Dec 12, 2008 at 2:35 PM, Tom Lane wrote:
> AFAICS, marginal enlargements in the sample size aren't going to help
> much for ndistinct --- you really need to look at most or all of the
> table to be guaranteed anything about that.
Well you only need to maintain a fixed percentage of the ta
On Thu, Dec 11, 2008 at 10:12 PM, Tom Lane wrote:
> "Robert Haas" writes:
>> I had this idle thought too, but I didn't write it down because...
>
>>> ought to be, but it seems like it ought to be possible to determine
>>> that given a desired maximum error in the overall estimate. I'm also
>>> n
On Fri, Dec 12, 2008 at 4:04 AM, Simon Riggs wrote:
>> The existing sampling mechanism is tied to solid statistics. It
>> provides the correct sample size to get a consistent confidence range
>> for range queries. This is the same mathematics which governs election
>> polling and other surveys. Th
On Thu, 2008-12-11 at 18:52 -0500, Tom Lane wrote:
> Simon Riggs writes:
> > On Thu, 2008-12-11 at 22:29 +0000, Gregory Stark wrote:
> >>> And I would like it even more if the sample size increased according
> >>> to table size, since that makes ndistinct values fairly random for
> >>> large tabl
On Fri, 2008-12-12 at 02:23 +0000, Greg Stark wrote:
> The existing sampling mechanism is tied to solid statistics. It
> provides the correct sample size to get a consistent confidence range
> for range queries. This is the same mathematics which governs election
> polling and other surveys. The
"Robert Haas" writes:
> I had this idle thought too, but I didn't write it down because...
>> ought to be, but it seems like it ought to be possible to determine
>> that given a desired maximum error in the overall estimate. I'm also
>> not very clear on what the "total frequency" computations (
>> OK, I'll bite. How do we decide where to put the cutoff? If we make
>> it too high, it will have a negative effect on join selectivity
>> estimates; if it's too low, it won't really address the problem we're
>> trying to fix. I randomly propose p = 0.001, which should limit
>> eqjoinsel() to
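For example, a cutoff of p = 0.001 keeps a value in the MCV list only if it
accounts for at least 0.1% of the table, so the list can never exceed 1000
entries regardless of the statistics target, bounding eqjoinsel's pairwise
comparison of two such lists at about 1000 x 1000 operations.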
>> Isn't a selectivity estimate of x = v as ( the number of values in v's
>> histogram bucket ) / ( number of distinct values in v's histogram
>> bucket ) pretty rational? Thats currently what we do for non-mcv
>> values, except that we look at ndistinct over the whole table instead
>> of individua
On Thu, Dec 11, 2008 at 11:44 PM, Simon Riggs wrote:
>
> On Thu, 2008-12-11 at 22:29 +0000, Gregory Stark wrote:
>
>> > And I would like it even more if the sample size increased according
>> > to table size, since that makes ndistinct values fairly random for
>> > large tables.
>>
>> Unfortunate
"Robert Haas" writes:
>> It might be best to stop when the frequency drops below some threshold,
>> rather than taking a fixed number of entries.
> OK, I'll bite. How do we decide where to put the cutoff? If we make
> it too high, it will have a negative effect on join selectivity
> estimates;
> I think though that the case for doing so is pretty good. "MCVs" that
> are beyond the K'th entry can't possibly have frequencies greater than
> 1/K, and in most cases it'll be a lot less. So the incremental
> contribution to the accuracy of the join selectivity estimate drops off
> pretty quic
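Concretely: once the MCV list is 100 entries long, the 101st candidate can
have a frequency of at most 1/100 = 1%, and in any realistically skewed table
it will usually be far below that, so each further entry refines the join
estimate less and less.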
"Nathan Boley" writes:
> Isn't a selectivity estimate of x = v as ( the number of values in v's
> histogram bucket ) / ( number of distinct values in v's histogram
> bucket ) pretty rational? Thats currently what we do for non-mcv
> values, except that we look at ndistinct over the whole table ins
Thanks for the response.
>> Well, ISTM there is a profound difference. For scalarineqsel we care
>> about the total number of values in a bucket. For eqsel we care about
>> the total number of *distinct* values in each bucket
>
> Really?
>
Well, we would obviously also care about the total numbe
"Nathan Boley" writes:
> Well, ISTM there is a profound difference. For scalarineqsel we care
> about the total number of values in a bucket. For eqsel we care about
> the total number of *distinct* values in each bucket
Really?
> IMHO, the whole idea of increasing mcv's seems a mistake. Why no
>> What is the specific difference between what you are talking about and
>> what scalarineqsel already implements?
>
> Hmm... Nothing new. Sorry for bothering you. I did not realize
> histograms are already implemented.
>
Well, ISTM there is a profound difference. For scalarineqsel we care
about th
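A worked example of the distinction: if a histogram bucket covers 1% of the
table's rows and contains 50 distinct values, the per-bucket estimate for an
equality match against a non-MCV value falling in that bucket would be
0.01 / 50 = 0.0002, whereas dividing by an ndistinct computed over the whole
table can give a very different answer when distinct values are unevenly
spread across buckets.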
Bruce Momjian writes:
> Why is selfuncs.c::var_eq_const() doing a linear scan over the MCV array
> instead of having the list sorted and doing a binary search on the
> array?
1. Keeping the MCV array sorted by frequency allows cheap extraction
of less-accurate MCV lists; you just take the first N
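The frequency-ordered arrays Tom describes are visible from SQL (table and
column names illustrative); since the most common values come first, a
shorter, less accurate MCV list is simply a prefix of them:

    SELECT most_common_vals, most_common_freqs
      FROM pg_stats
     WHERE tablename = 'orders' AND attname = 'customer_id';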
Simon Riggs writes:
> On Thu, 2008-12-11 at 22:29 +0000, Gregory Stark wrote:
>>> And I would like it even more if the sample size increased according
>>> to table size, since that makes ndistinct values fairly random for
>>> large tables.
>>
>> Unfortunately _any_ ndistinct estimate based on a s
Tom Lane wrote:
> "Robert Haas" writes:
> >> On the whole I think we have some evidence here to say that upping the
> >> default value of default_stats_target to 100 wouldn't be out of line,
> >> but 1000 definitely is. Comments?
>
> > Do you think there's any value in making it scale based on t
>>> Simon Riggs wrote:
> On Thu, 2008-12-11 at 17:45 -0500, Tom Lane wrote:
>> I don't see
>> much value in a data-type-dependent default anyway
I am dubious about the value there, too, at least for our environment.
> I would prefer distinct type or domain specific defaults
Now that would
On Thu, 2008-12-11 at 22:29 +0000, Gregory Stark wrote:
> > And I would like it even more if the sample size increased according
> > to table size, since that makes ndistinct values fairly random for
> > large tables.
>
> Unfortunately _any_ ndistinct estimate based on a sample of the table
> is
Simon Riggs writes:
> On Thu, 2008-12-11 at 17:45 -0500, Tom Lane wrote:
>> Simon Riggs writes:
>>> I would like it even more if there was a data type specific default.
>>> Currently we have a special case for boolean, but that's it.
>>
>> No, we don't (or if we do I'd be interested to know wher
"Robert Haas" writes:
> On Thu, Dec 11, 2008 at 5:45 PM, Tom Lane wrote:
>> Simon Riggs writes:
>>> I would like it even more if there was a data type specific default.
>>> Currently we have a special case for boolean, but that's it.
>>
>> No, we don't (or if we do I'd be interested to know whe
On Thu, 2008-12-11 at 17:45 -0500, Tom Lane wrote:
> Simon Riggs writes:
> > I would like it even more if there was a data type specific default.
> > Currently we have a special case for boolean, but that's it.
>
> No, we don't (or if we do I'd be interested to know where).
Your commit, selfun
On Thu, Dec 11, 2008 at 5:45 PM, Tom Lane wrote:
> Simon Riggs writes:
>> I would like it even more if there was a data type specific default.
>> Currently we have a special case for boolean, but that's it.
>
> No, we don't (or if we do I'd be interested to know where).
utils/adt/selfuncs.c, get
Gregory Stark writes:
> Tom Lane writes:
>> BTW, does anyone have an opinion about changing the upper limit for
>> default_stats_target to, say, 10000? These tests suggest that you
>> wouldn't want such a value for a column used as a join key, but
>> I can see a possible argument for high values
Tom Lane writes:
> 500 114.076
> 600 157.535
> 700 211.189
> 800 269.731
> 900 335.427
> 1000 409.638
>...
> BTW, does anyone have an opinion about changing the upper limit for
> default_stats_target to, say, 10000? These tests suggest that you
> wouldn't want such a value for a colu
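Reading those timings as ratios: doubling the target from 500 to 1000 takes
the figure from 114.076 to 409.638, a factor of about 3.6, close to the
factor of 4 that a purely quadratic cost in the statistics target would
predict.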
>
> What is the specific difference between what you are talking about and
> what scalarineqsel already implements?
>
Hmm... Nothing new. Sorry for bothering you. I did not realize
histograms are already implemented.
Regards,
Vladimir Sitnikov
Simon Riggs writes:
> I would like it even more if there was a data type specific default.
> Currently we have a special case for boolean, but that's it.
No, we don't (or if we do I'd be interested to know where). I don't see
much value in a data-type-dependent default anyway --- would you make
Gregory Stark writes:
> Simon Riggs writes:
>> And I would like it even more if the sample size increased according to
>> table size, since that makes ndistinct values fairly random for large
>> tables.
> Unfortunately _any_ ndistinct estimate based on a sample of the table is going
> to be pret
Simon Riggs writes:
> On Thu, 2008-12-11 at 13:09 -0500, Tom Lane wrote:
>
>> On the whole I think we have some evidence here to say that upping the
>> default value of default_stats_target to 100 wouldn't be out of line,
>> but 1000 definitely is. Comments?
>
> Sounds good to me.
>
> I would li
"Vladimir Sitnikov" writes:
>> I think that would likely be redundant with the histogram.
>>
> I am afraid I do not get you.
AFAICS what you're proposing *is* a histogram, just awkwardly described.
What is the specific difference between what you are talking about and
what scalarineqsel already
On Thu, 2008-12-11 at 13:09 -0500, Tom Lane wrote:
> On the whole I think we have some evidence here to say that upping the
> default value of default_stats_target to 100 wouldn't be out of line,
> but 1000 definitely is. Comments?
Sounds good to me.
I would like it even more if there was a da
>
>
> I think that would likely be redundant with the histogram.
>
I am afraid I do not get you. I mean histograms should be considered when it
comes to increasing the number of MCV entries (at least for numeric/timestamp
values). With a histogram, a lower number of entries could be sufficient to get
reason
> One more direction could be implementing "MCV" for range of values (group
> values and interpolate in between). Consider statistics on timestamp column
> that says that for "2008-December" there are as many X rows, for
> "2008-November" as many as Y, etc. That could be used for rather accurate
>
"Vladimir Sitnikov" writes:
> One more direction could be implementing "MCV" for range of values (group
> values and interpolate in between). Consider statistics on timestamp column
> that says that for "2008-December" there are as many X rows, for
> "2008-November" as many as Y, etc. That could
>
>
>
> There's something in what you say, but consider that we have pretty
> much unanimous agreement that 10 is too small. I think we should
> try to fix the problem, not just gradually ratchet up the value until
> people start complaining in the other direction. (Also, we should have
> plenty
"Robert Haas" writes:
>> On the whole I think we have some evidence here to say that upping the
>> default value of default_stats_target to 100 wouldn't be out of line,
>> but 1000 definitely is. Comments?
> Do you think there's any value in making it scale based on the size of
> the table?
As
On Thu, Dec 11, 2008 at 2:06 PM, Tom Lane wrote:
> "Vladimir Sitnikov" writes:
>> Do you consider using hash tables?
>
> Doubt it's really worth it, unless there's some way to amortize the
> setup cost across multiple selectivity estimations; which would surely
> complicate life.
>
> One thing th
>
>
> > Do you consider using hash tables?
>
> Doubt it's really worth it, unless there's some way to amortize the
> setup cost across multiple selectivity estimations; which would surely
> complicate life.
>
MCV lists are updated only during the "analyze" phase, aren't they? If the "setup
cost" is the
> When looking at these numbers one might think the threshold of pain
> is about 50, rather than 100 which is where I'd put it for the text
> example. However, this is probably an extreme worst case.
>
> On the whole I think we have some evidence here to say that upping the
> default value of defa
"Vladimir Sitnikov" writes:
> Do you consider using hash tables?
Doubt it's really worth it, unless there's some way to amortize the
setup cost across multiple selectivity estimations; which would surely
complicate life.
One thing that just now occurred to me is that as long as we maintain
the c
>
>
> BTW, does anyone have an opinion about changing the upper limit for
> default_stats_target to, say, 10000? These tests suggest that you
> wouldn't want such a value for a column used as a join key, but
> I can see a possible argument for high values in text search and
> similar applications.
"Robert Haas" writes:
> Ah, that makes sense. Here's a test case based on Greg's. This is
> definitely more than linear once you get above about n = 80, but it's
> not quadratic either. n = 1000 is only 43x n = 80, and while that's
> surely more than 1000/80 = 12.5, it's also a lot less than (1
> Yeah, that would be fast. To see a quadratic case you need MCV arrays
> that have little or no overlap of common values --- then each element of
> the first will be compared (in vain) to all or most of the elements in
> the second.
Ah, that makes sense. Here's a test case based on Greg's. Thi
"Robert Haas" <[EMAIL PROTECTED]> writes:
> On Mon, Dec 8, 2008 at 10:24 AM, Gregory Stark <[EMAIL PROTECTED]> wrote:
>> I tried a different query, trying to get quadratic growth and again failed.
>> It
> The profiling results I sent the other day show an exactly-linear
> increase in the number o
On Mon, Dec 8, 2008 at 10:24 AM, Gregory Stark wrote:
> I tried a different query, trying to get quadratic growth and again failed. It
The profiling results I sent the other day show an exactly-linear
increase in the number of times eqjoinsel invokes FunctionCall2.
Reading thro
On Mon, Dec 8, 2008 at 11:56 AM, Tom Lane wrote:
> Greg Stark writes:
>> That might only be the case when the pg_statistic record is in shared
>> buffers.
>
> Yeah, it seems unlikely that disabling compression is a good idea in the
> bigger scheme of things.
Greg Stark writes:
> That might only be the case when the pg_statistic record is in shared
> buffers.
Yeah, it seems unlikely that disabling compression is a good idea in the
bigger scheme of things.
> Also I wonder if eqjoinsel and company might need to be made more
> toa
That might only be the case when the pg_statistic record is in shared
buffers.
Also I wonder if eqjoinsel and company might need to be made more
toast-aware by detoasting all the things it needs once rather than
every time it accesses them.
greg
On 6 Dec 2008, at 06:19 PM, "Robert Haas"
Sorry for top posting but we are getting a bit far afield from the
original topic. I followed up the tests I did last night:
http://archives.postgresql.org/pgsql-hackers/2008-12/msg00369.php
I benchmarked 100 iterations of EXPLAIN on the query Greg Stark put
together as a synthetic benchmark for