On Thu, Jun 22, 2017 at 5:47 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Jun 12, 2017 at 9:50 PM, Haribabu Kommi
> <kommi.harib...@gmail.com> wrote:
> > Open Items:
> >
> > 1. The BitmapHeapScan and TableSampleScan are tightly coupled with
> > HeapTuple and HeapScanDesc, So these scans are directly operating
> > on those structures and providing the result.
> >
> > These scan types may not be applicable to different storage formats.
> > So how to handle them?
>
> I think that BitmapHeapScan, at least, is applicable to any table AM
> that has TIDs. It seems to me that in general we can imagine three
> kinds of table AMs:
>
> 1. Table AMs where a tuple can be efficiently located by a real TID.
> By a real TID, I mean that the block number part is really a block
> number and the item ID is really a location within the block. These
> are necessarily quite similar to our current heap, but they can change
> the tuple format and page format to some degree, and it seems like in
> many cases it should be possible to plug them into our existing index
> AMs without too much heartache. Both index scans and bitmap index
> scans ought to work.
>
> 2. Table AMs where a tuple has some other kind of locator. For
> example, imagine an index-organized table where the locator is the
> primary key, which is a bit like what Alvaro had in mind for indirect
> indexes. If the locator is 6 bytes or less, it could potentially be
> jammed into a TID, but I don't think that's a great idea. For things
> like int8 or numeric, it won't work at all. Even for other things,
> it's going to cause problems because the bit patterns won't be what
> the code is expecting; e.g. bitmap scans care about the structure of
> the TID, not just how many bits it is. (Due credit: Somebody, maybe
> Alvaro, pointed out this problem before, at PGCon.) For these kinds
> of tables, larger modifications to the index AMs are likely to be
> necessary, at least if we want a really general solution, or maybe we
> should have separate index AMs - e.g. btree for traditional TID-based
> heaps, and generic_btree or indirect_btree or key_btree or whatever
> for heaps with some other kind of locator. It's not too hard to see
> how to make index scans work with this sort of structure but it's very
> unclear to me whether, or how, bitmap scans can be made to work.
>
> 3. Table AMs where a tuple doesn't really have a locator at all. In
> these cases, we can't support any sort of index AM at all. When the
> table is queried, there's really nothing the core system can do except
> ask the table AM for a full scan, supply the quals, and hope the table
> AM has some sort of smarts that enable it to optimize somehow. For
> example, you can imagine converting cstore_fdw into a table AM of this
> sort - ORC has a sort of inbuilt BRIN-like indexing that allows whole
> chunks to be proven uninteresting and skipped. (You could use chunk
> number + offset to turn this into a table AM of the previous type if
> you wanted to support secondary indexes; not sure if that'd be useful,
> but it'd certainly be harder.)
>
> I'm more interested in #1 than in #3, and more interested in #3 than
> #2, but other people may have different priorities.

Hi Robert,

Thanks for the details and your opinion.
I also agree that option#1 is better to do first.

Regards,
Hari Babu
Fujitsu Australia
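
For readers following the thread, the three kinds of table AM in the quoted message can be summarized as a small lookup table. This is only an illustrative sketch in Python, not PostgreSQL code: the names `LocatorKind`, `SCAN_SUPPORT`, `scan_support`, and `fits_in_tid` are hypothetical, and the assumption that a full (sequential) scan is available for every kind is mine rather than stated in the message. The 6-byte figure matches the message's remark about jamming a locator into a TID (a heap TID is a 4-byte block number plus a 2-byte item offset).

```python
from enum import Enum

# A heap TID is 6 bytes: 4-byte block number + 2-byte item offset.
TID_BYTES = 6


class LocatorKind(Enum):
    """Hypothetical labels for the three kinds of table AM in the thread."""
    REAL_TID = 1       # kind 1: block number and item ID are real locations
    OTHER_LOCATOR = 2  # kind 2: e.g. primary key of an index-organized table
    NO_LOCATOR = 3     # kind 3: tuples have no locator at all


# Scan support as characterized in the quoted message.
# "seq" being available everywhere is an assumption of this sketch.
SCAN_SUPPORT = {
    LocatorKind.REAL_TID: {
        "seq": "yes", "index": "yes", "bitmap": "yes"},
    LocatorKind.OTHER_LOCATOR: {
        "seq": "yes", "index": "needs index AM changes", "bitmap": "unclear"},
    LocatorKind.NO_LOCATOR: {
        "seq": "yes", "index": "no", "bitmap": "no"},
}


def scan_support(kind: LocatorKind, scan: str) -> str:
    """Look up how well a scan type fits a given kind of table AM."""
    return SCAN_SUPPORT[kind][scan]


def fits_in_tid(locator_bytes: int) -> bool:
    """Size check only. Even a locator that fits can break code that
    depends on the block/offset structure of a TID, such as bitmap scans."""
    return locator_bytes <= TID_BYTES
```

For example, `fits_in_tid(8)` is false for an int8 key, which is the "won't work at all" case in the message, while `scan_support(LocatorKind.OTHER_LOCATOR, "bitmap")` returns `"unclear"`, reflecting the open question about bitmap scans for kind 2.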