On Mon, Dec 5, 2016 at 7:44 AM, Peter Geoghegan <p...@heroku.com> wrote:

> On Sat, Dec 3, 2016 at 7:23 PM, Tomas Vondra
> <tomas.von...@2ndquadrant.com> wrote:
> > I do share your concerns about unpredictable behavior - that's
> > particularly worrying for pg_restore, which may be used for time-
> > sensitive use cases (DR, migrations between versions), so unpredictable
> > changes in behavior / duration are unwelcome.
>
> Right.
>
> > But isn't this more a deficiency in pg_restore, than in CREATE INDEX?
> > The issue seems to be that the reltuples value may or may not get
> > updated, so maybe forcing ANALYZE (even very low statistics_target
> > values would do the trick, I think) would be a more appropriate solution?
> > Or maybe it's time to add at least some rudimentary statistics into the
> > dumps (the reltuples field seems like a good candidate).
>
> I think that there are a number of reasonable ways of looking at it. It
> might also be worthwhile to have a minimal ANALYZE performed by CREATE
> INDEX directly, iff there are no preexisting statistics (there is
> definitely going to be something pg_restore-like that we cannot fix --
> some ETL tool, for example). Perhaps, as an additional condition to
> proceeding with such an ANALYZE, it should also only happen when there
> is any chance at all of parallelism being used (but then you get into
> having to establish the relation size reliably in the absence of any
> pg_class.relpages, which isn't very appealing when there are many tiny
> indexes).
>
> In summary, I would really like it if a consensus emerged on how
> parallel CREATE INDEX should handle the ecosystem of tools like
> pg_restore, reindexdb, and so on. Personally, I'm neutral on which
> general approach should be taken. Proposals from other hackers about
> what to do here are particularly welcome.
>
>
Moved to next CF with "needs review" status.


Regards,
Hari Babu
Fujitsu Australia
