I see, thanks.

I'm now reading the source code of the statistics part, and I'm a little
confused about the "staop" column in the pg_statistic table. In
pg_statistic.h, the comment says:

/* ----------------
 * To allow keeping statistics on different kinds of datatypes,
 * we do not hard-wire any particular meaning for the remaining
 * statistical fields. Instead, we provide several "slots" in which
 * statistical data can be placed. Each slot includes:
 *      kind      integer code identifying kind of data (see below)
 *      op        OID of associated operator, if needed
 *      numbers   float4 array (for statistical values)
 *      values    anyarray (for representations of data values)
 * The ID and operator fields are never NULL; they are zeroes in an
 * unused slot.  The numbers and values fields are NULL in an unused
 * slot, and might also be NULL in a used slot if the slot kind has
 * no need for one or the other.
 * ----------------
 */
And:
// line 194: In a "most common values" slot, staop is the OID of the "="
operator used to decide whether values are the same or not.
// line 206: A "histogram" slot describes the distribution of scalar data.
staop is the OID of the "<" operator that describes the sort ordering.
....
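
As a minimal sketch of the slot layout those comments describe (Python, for
illustration only): the Slot class and the find_slot helper are hypothetical
names and not PostgreSQL code, and the operator OIDs are assumed example
values. Only the two kind codes are taken from pg_statistic.h.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

# Kind codes as defined in pg_statistic.h.
STATISTIC_KIND_MCV = 1
STATISTIC_KIND_HISTOGRAM = 2


@dataclass
class Slot:
    """Hypothetical model of one (stakind, staop, stanumbers, stavalues) group."""
    kind: int = 0                          # 0 means the slot is unused
    op: int = 0                            # operator OID, 0 in an unused slot
    numbers: Optional[List[float]] = None  # e.g. frequencies of the MCVs
    values: Optional[list] = None          # e.g. the MCVs or histogram bounds


def find_slot(slots: Sequence[Slot], kind: int, op: int = 0) -> Optional[Slot]:
    """Return the first slot of the given kind; a nonzero op must match too."""
    for slot in slots:
        if slot.kind == kind and (op == 0 or slot.op == op):
            return slot
    return None


# Assumed example OIDs standing in for the "=" and "<" operators of one type.
EQ_OP, LT_OP = 96, 97

slots = [
    Slot(STATISTIC_KIND_MCV, EQ_OP, numbers=[0.5, 0.25], values=[1, 2]),
    Slot(STATISTIC_KIND_HISTOGRAM, LT_OP, values=[3, 10, 20, 40]),
    Slot(),  # unused slot: kind and op are zero, arrays are None
]

mcv = find_slot(slots, STATISTIC_KIND_MCV, EQ_OP)
hist = find_slot(slots, STATISTIC_KIND_HISTOGRAM, LT_OP)
```

In this sketch a lookup succeeds only when both the kind and the operator
match, which is one plausible reading of why each slot carries an operator
OID.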

I don't understand the purpose of staop here. How does the optimizer use it?
Is there an example? Thanks!



2014/1/10 Amit Langote <amitlangot...@gmail.com>

> On Fri, Jan 10, 2014 at 11:19 PM, Atri Sharma <atri.j...@gmail.com> wrote:
> >
> >
> > Sent from my iPad
> >
> > On 10-Jan-2014, at 19:42, "ygnhzeus" <ygnhz...@gmail.com> wrote:
> >
> > Thanks for your reply.
> > So correlation is not related to the calculation of selectivity, right?
> > If I force PostgreSQL not to optimize the join order (by setting
> > join_collapse_limit and from_collapse_limit to 1), is there any other
> > factor that may affect the structure of the execution plan, regardless of
> > the data access method?
> >
> > 2014-01-10
> > ________________________________
> > ygnhzeus
> > ________________________________
> > From: Amit Langote <amitlangot...@gmail.com>
> > Sent: 2014-01-10 22:00
> > Subject: Re: [GENERAL] How to specify/mock the statistic data of tables in
> > PostgreSQL
> > To: "ygnhzeus" <ygnhz...@gmail.com>
> > Cc: "pgsql-general" <pgsql-general@postgresql.org>
> >
> >
> >
> > AFAIK, correlation is involved in calculating the costs that are used for
> > deciding the type of access. If the correlation is low, an index scan can
> > lead to quite a few random reads, and hence to higher costs.
> >
>
> Ah, I forgot to mention this point about how the planner uses correlation
> for access method selection.
>
> And selectivity is a function of the statistical distribution of column
> values, described in pg_statistic by histograms, most common values
> (with their occurrence frequencies), the number of distinct values, etc.
> It has nothing to do with correlation.
>
> --
> Amit Langote
>
