On 05/04/2019 23:25, Andres Freund wrote:
> I think what's in v12 - I don't know of any non-cleanup / bugfix work
> pending for 12 - is a pretty reasonable initial set of features.

Hooray!

> - the (optional) bitmap heap scan API - that's fairly intrinsically
>   block based. An AM could just internally subdivide TIDs in a different
>   way, but I don't think a bitmap scan like we have would e.g. make a
>   lot of sense for an index oriented table without any sort of stable
>   tid.

If an AM doesn't implement the bitmap heap scan API, what happens? Bitmap scans are disabled?

Even if an AM isn't block-oriented, the bitmap heap scan API still makes sense as long as there's some correlation between TIDs and physical location. The only really broken thing about that currently is the prefetching: nodeBitmapHeapScan.c calls PrefetchBuffer() directly with the TIDs' block numbers. It would be pretty straightforward to wrap that in a callback, so that the AM could do something different.
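
To illustrate the callback idea, here is a rough standalone sketch. It is not PostgreSQL code: BitmapPrefetchRoutine, prefetch_block, PrefetchStats and the "four logical blocks per extent" AM are all made up for the example, and default_prefetch() merely stands in for the direct PrefetchBuffer() call so that the sketch compiles without a buffer manager.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t BlockNumber;	/* stand-in for PostgreSQL's BlockNumber */

/* Hypothetical callback table; not part of the real table AM API. */
typedef struct BitmapPrefetchRoutine
{
	/* Start reading whatever backs 'blkno' in this AM; may be NULL. */
	void		(*prefetch_block) (void *am_state, BlockNumber blkno);
} BitmapPrefetchRoutine;

/* Counters so the behaviour can be observed in isolation. */
typedef struct PrefetchStats
{
	int			default_calls;
	int			am_calls;
	BlockNumber last_target;
} PrefetchStats;

/* Stand-in for today's direct PrefetchBuffer() call in the executor. */
static void
default_prefetch(PrefetchStats *stats, BlockNumber blkno)
{
	stats->default_calls++;
	stats->last_target = blkno;
}

/*
 * Example AM callback (hypothetical): this AM packs four logical blocks
 * into one physical extent, so it prefetches the extent instead.
 */
static void
extent_am_prefetch(void *am_state, BlockNumber blkno)
{
	PrefetchStats *stats = (PrefetchStats *) am_state;

	stats->am_calls++;
	stats->last_target = blkno / 4;
}

/* Executor side: use the AM's callback if it has one, else the default. */
static void
bitmap_prefetch(const BitmapPrefetchRoutine *routine, void *am_state,
				PrefetchStats *stats, BlockNumber blkno)
{
	if (routine != NULL && routine->prefetch_block != NULL)
		routine->prefetch_block(am_state, blkno);
	else
		default_prefetch(stats, blkno);
}
```

With a NULL callback the executor behaves as it does now; an AM that supplies one gets to translate the TID's block number into whatever it actually needs to read.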

Or move even more of the logic to the AM, so that the AM would get the whole TIDBitmap in table_beginscan_bm(). It could then implement the fetching and prefetching as it sees fit.

I don't think it's urgent, though. We can cross that bridge when we get there, with the first AM that needs that flexibility.

> The most constraining factor for storage, I think, is that currently the
> API relies on ItemPointerData style TIDs in a number of places (i.e. a 6
> byte tuple identifier).

I think 48 bits would be just about enough, but it's even more limited than you might think at the moment. There are a few places that assume that the offset number <= MaxHeapTuplesPerPage. See ginpostinglist.c, and MAX_TUPLES_PER_PAGE in tidbitmap.c. Also, the offset number can't be 0, because that makes the ItemPointer invalid, which is inconvenient if you try to use ItemPointer as just an arbitrary 48-bit integer.
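
To put a rough number on that, here is a standalone sketch of how an AM could map a dense row number onto a TID that stays within those limits. Everything named Sketch* / sketch_* is made up for the example. The value 291 assumes MaxHeapTuplesPerPage with the default 8 kB block size, and skipping block 0xFFFFFFFF (InvalidBlockNumber) is an extra precaution of mine, not something the places mentioned above are known to require.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: MaxHeapTuplesPerPage for the default 8 kB block size. */
#define SKETCH_MAX_OFFSET	291u

/* Number of usable blocks, leaving out 0xFFFFFFFF (InvalidBlockNumber). */
#define SKETCH_BLOCK_COUNT	UINT64_C(0xFFFFFFFF)

/* Row ids 0 .. SKETCH_ROW_ID_COUNT - 1 are representable. */
#define SKETCH_ROW_ID_COUNT (SKETCH_BLOCK_COUNT * SKETCH_MAX_OFFSET)

/* Same shape as an ItemPointer: 32-bit block number, 16-bit offset. */
typedef struct SketchTid
{
	uint32_t	block;
	uint16_t	offset;
} SketchTid;

/*
 * Map a dense row id to a TID whose offset is always in
 * 1 .. SKETCH_MAX_OFFSET. Returns false if the id does not fit.
 */
static bool
sketch_tid_encode(uint64_t row_id, SketchTid *tid)
{
	if (row_id >= SKETCH_ROW_ID_COUNT)
		return false;

	tid->block = (uint32_t) (row_id / SKETCH_MAX_OFFSET);
	tid->offset = (uint16_t) (row_id % SKETCH_MAX_OFFSET + 1);
	return true;
}

/* Inverse of sketch_tid_encode(). */
static uint64_t
sketch_tid_decode(SketchTid tid)
{
	assert(tid.offset >= 1 && tid.offset <= SKETCH_MAX_OFFSET);
	return (uint64_t) tid.block * SKETCH_MAX_OFFSET + (tid.offset - 1);
}
```

Under those assumptions the usable space is 0xFFFFFFFF * 291 values, a little over 2^40, rather than the full 2^48.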

- Heikki

