Looks a bit better now, though I think you need to think more about the encapsulation of the structs. More detailed comments below.

Jie Zhang wrote:
Essentially, we want to have a stream bitmap object that has an iterator,
which will be able to iterate through the bitmaps. The BitmapIndexscan,
BitmapAnd, BitmapOr will be executed once and return a streamp bitmap or a
hash bitmap. The BitmapHeapscan then calls tbm_iterate() to iterate through
the bitmaps.

The StreamBitmap structure will look like below.

struct StreamBitmap {
NodeTag type; /* to make it a valid Node */
PagetableEntry entry; /* a page of tids in this stream bitmap */

/* the iterator function */
void (*next)(StreamBitmap*);
Node* state; /* store how this stream bitmap generated,
and all necessary information to
obtain the next stream bitmap. */
};

I'd suggest making state just a (void *). It's private to the producer of the bitmap, and I don't see a reason to expose it. I assume that the next-function fills in the PageTableEntry with the next set of tids.

Two new state objects will look like below. At the same time, we introduce
three new node types: T_StreamBitmapAND, T_StreamBitmapOR,
And T_StreamBitmapIndex, to define different states.

/*
* Stores the necessary information for iterating through the stream bitmaps
* generated by nodeBitmapAnd or nodeBitmapOr.
*/
struct StreamBitmapOp {
NodeTag type; /* handles T_StreamBitmapAND and T_StreamBitmapOR */
List* bitmaps;
};

AFAICS, this struct is private to tidbitmap.c, where the new stream-enabled tbm_intersect etc. functions are defined. Also, I don't see a reason why it needs to by a valid Node.

/*
* Stores some necessary information for iterating through the stream
* bitmaps generated by nodeBitmapIndexscan.
*/
struct StreamBitmapIndex {
NodeTag type; /* handle T_StreamBitmapIndex */
IndexScanDesc scan;
BlockNumber nextBlockNo;/* next block no to be read */
};

Where would this struct be defined? I think different index access methods might want to have different kind of states. The struct above assumes that the position of an index scan is always represented by the nextBlockNo. That seems to be the right thing for the bitmap indexam, so this struct is fine for StreamBitmaps returned by bmgetbitmap, but not necessary for others.

Then we will have the iterator functions like the following:

...

void StreamBitmapIndexNext(StreamBitmap* node) {
StreamBitmapIndex* sbi = (StreamBitmapIndex*) node->state;
amgetbitmap(sbi->scan, NULL, sbi->nextBlockNo);
}

This means that the amgetbitmap function would still be called many times in each scan. What would amgetbitmap return? A new StreamBitmap on each call?

I'd like to see just one call to amgetbitmap. It would a) fill in the hash bitmap passed to it, b) return a new hash bitmap, or c) return a new StreamBitmap, with a indexam specific next-function that returns the tids one page at a time. I think we'll also need some kind of a destructor in StreamBitmap that's called by the consumer of the bitmap after it's done with it.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

Reply via email to