On 2/25/15 2:56 PM, Tomas Vondra wrote:
On 24.2.2015 19:08, Jim Nasby wrote:
On 2/22/15 8:32 PM, Tomas Vondra wrote:
On 23.2.2015 03:20, Jim Nasby wrote:
On 2/22/15 5:41 PM, Tomas Vondra wrote:
Otherwise, the code looks OK to me. Now, there are a few features I'd
like to have for production use (to minimize the impact):

1) no index support:-(

      I'd like to see support for more relation types (at least btree
      indexes). Are there any plans for that? Do we have an idea on
how to
      compute that?

It'd be cleaner if had actual an actual am function for this, but see
below.

2) sampling just a portion of the table

      For example, being able to sample just 5% of blocks, making it
less
      obtrusive, especially on huge tables. Interestingly, there's a
      TABLESAMPLE patch in this CF, so maybe it's possible to reuse some
      of the methods (e.g. functions behind SYSTEM sampling)?

3) throttling

      Another feature minimizing impact of running this on production
might
      be some sort of throttling, e.g. saying 'limit the scan to 4 MB/s'
      or something along those lines.

4) prefetch

      fbstat_heap is using visibility map to skip fully-visible pages,
      which is nice, but if we skip too many pages it breaks readahead
      similarly to bitmap heap scan. I believe this is another place
where
      effective_io_concurrency (i.e. prefetch) would be appropriate.

All of those wishes are solved in one way or another by vacuum and/or
analyze. If we had a hook in the tuple scanning loop and at the end of
vacuum you could just piggy-back on it. But really all we'd need for
vacuum to be able to report this info is one more field in LVRelStats, a
call to GetRecordedFreeSpace for all-visible pages, and some logic to
deal with pages skipped because we couldn't get the vacuum lock.

Should we just add this to vacuum instead?

Possibly. I think the ultimate goal is to be able to get this info
easily and without disrupting the system performance too much (which is
difficult without sampling/throttling). If we can stuff that into
autovacuum reasonably, and then get the info from catalogs, I'm OK with
that.

Doing the counting in vacuum/analyze (auto or not) is quite easy, and it
would happen at the same time we're doing useful work. We would
automatically get the benefit of the throttling and sampling work that
those routines already do.

However I'm not sure putting this into autovacuum is actually possible,
because how do you merge data from multiple partial runs (when each of
them skipped different pages)?

ISTM that's just a form of sampling, no?

Maybe.

I was thinking about collecting the necessary info during the VACUUM
phase, and somehow keeping track of free space in the whole table. I
thought there would be trouble exactly because this phase only processes
pages that possibly need vacuuming (so it wouldn't be a truly random
sample, making the estimation tricky).

Well, all-visible pages can still be listed in the FSM, so we'd have at least that info. I don't think there can be dead tuples on an all-visible page; if so, that means free space is all we care about. So when pulling bloat info we'd just have to scan the VSM and FSM; presumably that's pretty cheap.

But maybe that's not really true and it is possible to do that somehow.
For example what if we kept track of how much space each VACUUM freed,
and keeping running sum?

Well, then we'd also have to keep track of space we used. I don't think we want to do that.

On a completely unrelated note, the idea of logging details about vacuum runs in a table is appealing. Forcing users to something like pgBadger for that data seems pretty silly to me. But that's obviously a completely separate discussion from this one.

It might also be done during the ANALYZE, but that seems a bit
complicated because that's based on a sample of rows, not pages.

Right, but you've got the page anyway so I doubt it'd cost much extra to measure the bloat on it. Might want to guard against counting a page twice, but...

Also,
the autovacuum_analyze_factor is 0.2 by default, so could end up with up
to 20% bloat without knowing it (vacuum_factor=0.1 is not great either,
but it's better).

... I don't think it's practical to get this bloat measurement terribly precise, no do I think we need to. Anyone that needs better than 20% accuracy can probably afford to fire off a manual vacuum at the same time (or run a scan-only version).

Besides, we don't need the same lock for figuring out bloat. We
could still measure bloat even if we can't vacuum the page, but I
think that's overkill. If we're skipping enough pages to mess with
the bloat measurement then we most likely need to teach vacuum how to
revisit pages.

Also, autovacuum is not the only place
where we free space - we'd have to handle HOT for example, I guess.

I wasn't thinking about trying to keep live bloat statistics, so HOT
wouldn't affect this.

I'm afraid this might cause the estimate to drift away over time, but I
guess it depends on implementation - e.g. if doing this in ANALYZE, it'd
be mostly immune to this, while with collecting incremental info during
VACUUM it might be a problem I guess.

Yeah, I just don't see the need for that level of accuracy.

Anyway, we don't have a patch trying to do that (certainly not in this
CF). I think it makes sense to add fastbloat() to pageinspect. Maybe
we'll get autovacuum-based solution in the future, but we don't have
that right now.

I still think we could add this as an option to at least vacuum in this CF. It should be far less code and gets us throttling for free. It wouldn't be hard to add a "BLOAT ONLY" option to bypass all the other work; it'd still be less code.

That said, I don't want to block this; I think it's useful. Though, perhaps it would be better as an extension instead of in contrib? I don't think it should be very version dependent?

Actually, wouldn't that be a nice GSoC project? The scope seems about
right, not touching too many parts of the code base, etc.

In it's simplest form I think it'd be too small, but if we got more advanced than simply adding some counters to vacuum then I agree.

I think I'd rather gave a system for logging important stuff (like vacuum stats) in tables with a way to prevent infinite growth though. ;)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to