On Mon, Feb 23, 2015 at 7:11 AM, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> On 28.1.2015 05:03, Abhijit Menon-Sen wrote:
> > At 2015-01-27 17:00:27 -0600, jim.na...@bluetreble.com wrote:
> >>
> Otherwise, the code looks OK to me. Now, there are a few features I'd
> like to have for production use (to minimize the impact):
>
> 1) no index support :-(
>
> I'd like to see support for more relation types (at least btree
> indexes). Are there any plans for that? Do we have an idea on how to
> compute that?
>
> 2) sampling just a portion of the table
>
> For example, being able to sample just 5% of blocks, making it less
> obtrusive, especially on huge tables. Interestingly, there's a
> TABLESAMPLE patch in this CF, so maybe it's possible to reuse some
> of the methods (e.g. functions behind SYSTEM sampling)?
>
> 3) throttling
>
> Another feature minimizing impact of running this on production might
> be some sort of throttling, e.g. saying 'limit the scan to 4 MB/s'
> or something along those lines.
>
I think these features could be done separately if anybody is
interested. The patch in its proposed form seems useful to me.

> 4) prefetch
>
> fbstat_heap is using visibility map to skip fully-visible pages,
> which is nice, but if we skip too many pages it breaks readahead
> similarly to bitmap heap scan. I believe this is another place where
> effective_io_concurrency (i.e. prefetch) would be appropriate.
>

Good point. We could even use the technique VACUUM uses, which is to
skip pages only when at least SKIP_PAGES_THRESHOLD consecutive pages
can be skipped.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
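
For readers unfamiliar with the skip-threshold idea mentioned in the reply,
here is a rough sketch of the logic in Python. It is not code from the patch
or from PostgreSQL: the function name blocks_to_scan and the list-of-booleans
model of the visibility map are made up for illustration, and the default
threshold of 32 is only assumed here to mirror VACUUM's constant. The point is
that short runs of all-visible pages are read anyway, so the scan stays mostly
sequential and OS readahead keeps working.

```python
from typing import List, Sequence

# Assumed default for illustration; the real constant lives in PostgreSQL's
# vacuum code and is not redefined by this sketch.
SKIP_PAGES_THRESHOLD = 32


def blocks_to_scan(all_visible: Sequence[bool],
                   threshold: int = SKIP_PAGES_THRESHOLD) -> List[int]:
    """Return the block numbers a scan would read.

    all_visible[i] is True when block i is marked all-visible (a toy
    stand-in for a visibility map lookup). A run of all-visible blocks
    is skipped only if it is at least `threshold` blocks long.
    """
    to_scan: List[int] = []
    n = len(all_visible)
    blk = 0
    while blk < n:
        if not all_visible[blk]:
            # Not all-visible: must always be read.
            to_scan.append(blk)
            blk += 1
            continue
        # Find the end of the current run of all-visible blocks.
        run_end = blk
        while run_end < n and all_visible[run_end]:
            run_end += 1
        if run_end - blk < threshold:
            # Run too short to be worth breaking readahead: read it anyway.
            to_scan.extend(range(blk, run_end))
        blk = run_end
    return to_scan


if __name__ == "__main__":
    vm = [False] + [True] * 3 + [False] + [True] * 5 + [False]
    print(blocks_to_scan(vm, threshold=4))
```

With threshold=4, the three-block all-visible run is still read, while the
five-block run is skipped.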