Re: [HACKERS] Bitmap index status

2006-10-21 Thread Bruce Momjian
Gavin Sherry wrote:
 On Wed, 18 Oct 2006, Heikki Linnakangas wrote:
 
  Hi,
 
  I don't want to harass you :), but what's the status with the bitmap
  index code? Is there something I can do to help?
 
 
 Hi Heikki,
 
 The streaming is implemented, as are range queries. I need to bring it up
 to HEAD and back-patch to bizgres since... it's not diverged fairly
 significantly from that code base.
 
 Two outstanding items are handling vacuum and I was considering having a
 bitmap selectivity function but I haven't really looked into it.
 
 Once I bring it up to HEAD I'll post.

Code promises for this feature have been unreliable in the past.  :-(

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Bitmap index status

2006-10-18 Thread Heikki Linnakangas

Hi,

I don't want to harass you :), but what's the status with the bitmap 
index code? Is there something I can do to help?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Bitmap index status

2006-10-18 Thread Jie Zhang

On 10/18/06 2:41 AM, Heikki Linnakangas [EMAIL PROTECTED] wrote:

 Hi,
 
 I don't want to harass you :), but what's the status with the bitmap
 index code? Is there something I can do to help?

Not at all. We will send you the new patch soon. Your comments are most
appreciated.

Thanks,
Jie



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Bitmap index status

2006-10-18 Thread Gavin Sherry
On Wed, 18 Oct 2006, Heikki Linnakangas wrote:

 Hi,

 I don't want to harass you :), but what's the status with the bitmap
 index code? Is there something I can do to help?


Hi Heikki,

The streaming is implemented, as are range queries. I need to bring it up
to HEAD and back-patch to bizgres since... it's not diverged fairly
significantly from that code base.

Two outstanding items are handling vacuum and I was considering having a
bitmap selectivity function but I haven't really looked into it.

Once I bring it up to HEAD I'll post.

Thanks,

Gavin


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Bitmap index status

2006-09-28 Thread Mark Wong

Luke Lonergan wrote:

Mark,

On 9/25/06 11:32 AM, Mark Wong [EMAIL PROTECTED] wrote:


Yeah, basically gather as many stats as I can to accurately profile the
overall system performance.  I thought it would be appropriate to use a
TPC-H based workload as one measuring stick to use for bitmap indexes.


Note that the TPC-H queries don't follow the typical good use case for
bitmap indexes.  You'd like to see queries that use multiple AND and OR
clauses, otherwise there may be no benefit.


Oh right, people keep telling me that and it keeps going in one ear and 
out the other...



Also, DBT-3/TPC-H on Postgres right now does not benefit from indices
overall.  The planner has limitations WRT selectivity estimates and other
limitations that cause it to choose index access poorly for the query
workload.  We have two new features coming (for 8.3) that fix this, but for
now we find that indexes are a net loss, in some queries a huge loss.


Great, I'll keep my eye for those. :)


If you look at the whitepaper that Ayush Parashar published, he uses the
TPC-H data with some targeted queries that showcase the best use-cases for
bitmap index.


Mark

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Bitmap index status

2006-09-26 Thread Gavin Sherry
On Tue, 26 Sep 2006, Heikki Linnakangas wrote:

 Looks a bit better now, though I think you need to think more about the
 encapsulation of the structs. More detailed comments below.

 Jie Zhang wrote:
  Essentially, we want to have a stream bitmap object that has an iterator,
  which will be able to iterate through the bitmaps. The BitmapIndexscan,
  BitmapAnd, BitmapOr will be executed once and return a streamp bitmap or a
  hash bitmap. The BitmapHeapscan then calls tbm_iterate() to iterate
  through
  the bitmaps.
 
  The StreamBitmap structure will look like below.
 
  struct StreamBitmap {
  NodeTag type; /* to make it a valid Node */
  PagetableEntry entry; /* a page of tids in this stream bitmap */
 
  /* the iterator function */
  void (*next)(StreamBitmap*);
  Node* state; /* store how this stream bitmap generated,
  and all necessary information to
  obtain the next stream bitmap. */
  };

 I'd suggest making state just a (void *). It's private to the producer
 of the bitmap, and I don't see a reason to expose it. I assume that the
 next-function fills in the PageTableEntry with the next set of tids.

  Two new state objects will look like below. At the same time, we introduce
  three new node types: T_StreamBitmapAND, T_StreamBitmapOR,
  And T_StreamBitmapIndex, to define different states.
 
  /*
  * Stores the necessary information for iterating through the stream
  bitmaps
  * generated by nodeBitmapAnd or nodeBitmapOr.
  */
  struct StreamBitmapOp {
  NodeTag type; /* handles T_StreamBitmapAND and T_StreamBitmapOR */
  List* bitmaps;
  };

 AFAICS, this struct is private to tidbitmap.c, where the new
 stream-enabled tbm_intersect etc. functions are defined. Also, I don't
 see a reason why it needs to by a valid Node.

Well, making it a valid nodes makes it easy to identify (IsA) and gives us
access to copy/equal frameworks. I do think that it is best to bury this
in the bitmap code however.

  * Stores some necessary information for iterating through the stream
  * bitmaps generated by nodeBitmapIndexscan.
  */
  struct StreamBitmapIndex {
  NodeTag type; /* handle T_StreamBitmapIndex */
  IndexScanDesc scan;
  BlockNumber nextBlockNo;/* next block no to be read */
  };

 Where would this struct be defined? I think different index access
 methods might want to have different kind of states. The struct above
 assumes that the position of an index scan is always represented by the
 nextBlockNo. That seems to be the right thing for the bitmap indexam, so
 this struct is fine for StreamBitmaps returned by bmgetbitmap, but not
 necessary for others.

right.


  Then we will have the iterator functions like the following:
 
  ...
 
  void StreamBitmapIndexNext(StreamBitmap* node) {
  StreamBitmapIndex* sbi = (StreamBitmapIndex*) node-state;
  amgetbitmap(sbi-scan, NULL, sbi-nextBlockNo);
  }

 This means that the amgetbitmap function would still be called many
 times in each scan.  What would amgetbitmap return? A new StreamBitmap
 on each call?

 I'd like to see just one call to amgetbitmap. It would a) fill in the
 hash bitmap passed to it, b) return a new hash bitmap, or c) return a
 new StreamBitmap, with a indexam specific next-function that returns the
 tids one page at a time. I think we'll also need some kind of a
 destructor in StreamBitmap that's called by the consumer of the bitmap
 after it's done with it.

Right, I agree. I am working on this now.

Thanks,

gavin

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Bitmap index status

2006-09-26 Thread Heikki Linnakangas
Looks a bit better now, though I think you need to think more about the 
encapsulation of the structs. More detailed comments below.


Jie Zhang wrote:

Essentially, we want to have a stream bitmap object that has an iterator,
which will be able to iterate through the bitmaps. The BitmapIndexscan,
BitmapAnd, BitmapOr will be executed once and return a streamp bitmap or a
hash bitmap. The BitmapHeapscan then calls tbm_iterate() to iterate 
through

the bitmaps.

The StreamBitmap structure will look like below.

struct StreamBitmap {
NodeTag type; /* to make it a valid Node */
PagetableEntry entry; /* a page of tids in this stream bitmap */

/* the iterator function */
void (*next)(StreamBitmap*);
Node* state; /* store how this stream bitmap generated,
and all necessary information to
obtain the next stream bitmap. */
};


I'd suggest making state just a (void *). It's private to the producer 
of the bitmap, and I don't see a reason to expose it. I assume that the 
next-function fills in the PageTableEntry with the next set of tids.



Two new state objects will look like below. At the same time, we introduce
three new node types: T_StreamBitmapAND, T_StreamBitmapOR,
And T_StreamBitmapIndex, to define different states.

/*
* Stores the necessary information for iterating through the stream 
bitmaps

* generated by nodeBitmapAnd or nodeBitmapOr.
*/
struct StreamBitmapOp {
NodeTag type; /* handles T_StreamBitmapAND and T_StreamBitmapOR */
List* bitmaps;
};


AFAICS, this struct is private to tidbitmap.c, where the new 
stream-enabled tbm_intersect etc. functions are defined. Also, I don't 
see a reason why it needs to by a valid Node.



/*
* Stores some necessary information for iterating through the stream
* bitmaps generated by nodeBitmapIndexscan.
*/
struct StreamBitmapIndex {
NodeTag type; /* handle T_StreamBitmapIndex */
IndexScanDesc scan;
BlockNumber nextBlockNo;/* next block no to be read */
};


Where would this struct be defined? I think different index access 
methods might want to have different kind of states. The struct above 
assumes that the position of an index scan is always represented by the 
nextBlockNo. That seems to be the right thing for the bitmap indexam, so 
this struct is fine for StreamBitmaps returned by bmgetbitmap, but not 
necessary for others.



Then we will have the iterator functions like the following:

...

void StreamBitmapIndexNext(StreamBitmap* node) {
StreamBitmapIndex* sbi = (StreamBitmapIndex*) node-state;
amgetbitmap(sbi-scan, NULL, sbi-nextBlockNo);
}


This means that the amgetbitmap function would still be called many 
times in each scan.  What would amgetbitmap return? A new StreamBitmap 
on each call?


I'd like to see just one call to amgetbitmap. It would a) fill in the 
hash bitmap passed to it, b) return a new hash bitmap, or c) return a 
new StreamBitmap, with a indexam specific next-function that returns the 
tids one page at a time. I think we'll also need some kind of a 
destructor in StreamBitmap that's called by the consumer of the bitmap 
after it's done with it.


--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Bitmap index status

2006-09-25 Thread Jie Zhang
Hi Mark,

Thanks for doing the test. I checked out the link you provided below. I am a
little confused about the goal of these tests. Do you plan to test the
overall performance of postgreSQL on handling TPC-H queries?

Thanks,
Jie

On 9/22/06 3:45 PM, Mark Wong [EMAIL PROTECTED] wrote:

 Jie Zhang wrote:
 Hi Heikki and all,
 
 I just sent the latest bitmap index patch to the list. I am not sure if
 there is any size limit for this mailing list. If you have received my
 previous email, please let me know.
 
 Hi Jie,
 
 I know I said I was going to get testing on this months ago but I've
 been juggling between 3 systems due to disk failures and other hardware
 configuration issues.  Anyways, I've take a baseline run of only the
 power test using a 1GB database with the patch 09-17 patch against a
 snapshot of pgsql from 2006-09-17:
 
 http://dbt.osdl.org/dbt/dbt3testing/results/dev8-007/2/
 
 Do you think the 1GB scale factor will be sufficient for testing as it
 will certainly be faster?  Do you think testing with just a power test
 will be sufficient for now?  I really don't have a good reason why I
 didn't run a throughput test other than to save time. :)  I also wanted
 to get your opinion again on which indexes we will want to try first.
 
 Thanks,
 Mark
 



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Bitmap index status

2006-09-25 Thread Mark Wong

Hi Jie,

Yeah, basically gather as many stats as I can to accurately profile the 
overall system performance.  I thought it would be appropriate to use a 
TPC-H based workload as one measuring stick to use for bitmap indexes.


Mark

Jie Zhang wrote:

Hi Mark,

Thanks for doing the test. I checked out the link you provided below. I am a
little confused about the goal of these tests. Do you plan to test the
overall performance of postgreSQL on handling TPC-H queries?

Thanks,
Jie

On 9/22/06 3:45 PM, Mark Wong [EMAIL PROTECTED] wrote:


Jie Zhang wrote:

Hi Heikki and all,

I just sent the latest bitmap index patch to the list. I am not sure if
there is any size limit for this mailing list. If you have received my
previous email, please let me know.

Hi Jie,

I know I said I was going to get testing on this months ago but I've
been juggling between 3 systems due to disk failures and other hardware
configuration issues.  Anyways, I've take a baseline run of only the
power test using a 1GB database with the patch 09-17 patch against a
snapshot of pgsql from 2006-09-17:

http://dbt.osdl.org/dbt/dbt3testing/results/dev8-007/2/

Do you think the 1GB scale factor will be sufficient for testing as it
will certainly be faster?  Do you think testing with just a power test
will be sufficient for now?  I really don't have a good reason why I
didn't run a throughput test other than to save time. :)  I also wanted
to get your opinion again on which indexes we will want to try first.

Thanks,
Mark



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Bitmap index status

2006-09-25 Thread Luke Lonergan
Mark,

On 9/25/06 11:32 AM, Mark Wong [EMAIL PROTECTED] wrote:

 Yeah, basically gather as many stats as I can to accurately profile the
 overall system performance.  I thought it would be appropriate to use a
 TPC-H based workload as one measuring stick to use for bitmap indexes.

Note that the TPC-H queries don't follow the typical good use case for
bitmap indexes.  You'd like to see queries that use multiple AND and OR
clauses, otherwise there may be no benefit.

Also, DBT-3/TPC-H on Postgres right now does not benefit from indices
overall.  The planner has limitations WRT selectivity estimates and other
limitations that cause it to choose index access poorly for the query
workload.  We have two new features coming (for 8.3) that fix this, but for
now we find that indexes are a net loss, in some queries a huge loss.

If you look at the whitepaper that Ayush Parashar published, he uses the
TPC-H data with some targeted queries that showcase the best use-cases for
bitmap index.

- Luke 



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Bitmap index status

2006-09-23 Thread Jie Zhang
Gavin  Heikki,

 
 The handling of stream and hash bitmaps looks pretty complicated to me.
 All the bitmap-related nodes have logic to handle both types slightly
 differently. It all seems to come down to that if a subnode (or
 amgetbitmap in a bitmap index scan node) returns a StreamBitmap, the
 caller needs to call the subnode many times, until it returns a NULL.
 With a HashBitmap, the caller only calls the subnode once.
 
 I think amgetbitmap should be called just once per index scan, and it
 should return either a hash bitmap or a stream bitmap. The same applies
 to all the executor nodes that return bitmaps, they would only return a
 single HashBitmap or StreamBitmap, and the upper node would call
 tbm_iterate repeatedly on that.
 
 StreamBitmap would contain a callback (filled by the indexam) that
 tbm_iterate would call to fill the next TBMIterateResult.
 
 Right, this was the approach taken by an earlier version of the patch I
 had worked on. It was significantly uglified by the need to keep the index
 state around and to be careful about what amrescan might do behind out
 back. I will, however, introduce the idea again because it makes the code
 much cleaner and more logical, as you seem to suggest.
 

I have been thinking about this approach some more. I do agree that this is
more attractive now. The following includes some more detailed design.
Please let me know if you have any comments. (My apologies to Gavin. You
talked to me about this approach before. But you introduced some on-disk
bitmap specific code into the tidbitmap.c, which prevented me from looking
more in this direction.)

Essentially, we want to have a stream bitmap object that has an iterator,
which will be able to iterate through the bitmaps. The BitmapIndexscan,
BitmapAnd, BitmapOr will be executed once and return a streamp bitmap or a
hash bitmap. The BitmapHeapscan then calls tbm_iterate() to iterate through
the bitmaps.

The StreamBitmap structure will look like below.

struct StreamBitmap {
NodeTag   type;   /* to make it a valid Node */
PagetableEntryentry;  /* a page of tids in this stream bitmap */

/* the iterator function */
void(*next)(StreamBitmap*);
Node*   state;/* store how this stream bitmap generated,
 and all necessary information to
 obtain the next stream bitmap. */
};

Two new state objects will look like below. At the same time, we introduce
three new node types: T_StreamBitmapAND, T_StreamBitmapOR,
And T_StreamBitmapIndex, to define different states.

/*
 * Stores the necessary information for iterating through the stream bitmaps
 * generated by nodeBitmapAnd or nodeBitmapOr.
 */
struct StreamBitmapOp {
NodeTag type;  /* handles T_StreamBitmapAND and T_StreamBitmapOR */
List*   bitmaps;
};

/*
 * Stores some necessary information for iterating through the stream
 * bitmaps generated by nodeBitmapIndexscan.
 */
struct StreamBitmapIndex {
NodeTag type; /* handle T_StreamBitmapIndex */
IndexScanDescscan;
BlockNumbernextBlockNo;/* next block no to be read */
};

Then we will have the iterator functions like the following:

void StreamBitmapAndNext(StreamBitmap* node) {
  tbm_intersect_stream(((StreampBitmapOp*) node-state)-bitmaps, node);
}

void StreamBitmapOrNext(StreamBitmap* node) {
  tbm_union_stream(((StreampBitmapOp*) node-state)-bitmaps, node);
}

void StreamBitmapIndexNext(StreamBitmap* node) {
  StreamBitmapIndex* sbi = (StreamBitmapIndex*) node-state;
  amgetbitmap(sbi-scan, NULL, sbi-nextBlockNo);
}

What do you think?

Thanks,
Jie



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Bitmap index status

2006-09-22 Thread Mark Wong

Jie Zhang wrote:

Hi Heikki and all,

I just sent the latest bitmap index patch to the list. I am not sure if
there is any size limit for this mailing list. If you have received my
previous email, please let me know.


Hi Jie,

I know I said I was going to get testing on this months ago but I've 
been juggling between 3 systems due to disk failures and other hardware 
configuration issues.  Anyways, I've take a baseline run of only the 
power test using a 1GB database with the patch 09-17 patch against a 
snapshot of pgsql from 2006-09-17:


http://dbt.osdl.org/dbt/dbt3testing/results/dev8-007/2/

Do you think the 1GB scale factor will be sufficient for testing as it 
will certainly be faster?  Do you think testing with just a power test 
will be sufficient for now?  I really don't have a good reason why I 
didn't run a throughput test other than to save time. :)  I also wanted 
to get your opinion again on which indexes we will want to try first.


Thanks,
Mark

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Bitmap index status

2006-09-20 Thread Jie Zhang

On 9/19/06 5:15 AM, Gavin Sherry [EMAIL PROTECTED] wrote:

 On Tue, 19 Sep 2006, Heikki Linnakangas wrote:
 
 Jie Zhang wrote:
 Hi Heikki and all,
 
 Please find the latest bitmap index patch in the attachment. This patch is
 generated against the postgresql cvs head.
 
 
 Thanks.
 
 The handling of stream and hash bitmaps looks pretty complicated to me.
 All the bitmap-related nodes have logic to handle both types slightly
 differently. It all seems to come down to that if a subnode (or
 amgetbitmap in a bitmap index scan node) returns a StreamBitmap, the
 caller needs to call the subnode many times, until it returns a NULL.
 With a HashBitmap, the caller only calls the subnode once.
 
 I think amgetbitmap should be called just once per index scan, and it
 should return either a hash bitmap or a stream bitmap. The same applies
 to all the executor nodes that return bitmaps, they would only return a
 single HashBitmap or StreamBitmap, and the upper node would call
 tbm_iterate repeatedly on that.
 
 StreamBitmap would contain a callback (filled by the indexam) that
 tbm_iterate would call to fill the next TBMIterateResult.
 
 Right, this was the approach taken by an earlier version of the patch I
 had worked on. It was significantly uglified by the need to keep the index
 state around and to be careful about what amrescan might do behind out
 back. I will, however, introduce the idea again because it makes the code
 much cleaner and more logical, as you seem to suggest.

I will think about this approach. But I am not quite convinced that this
approach will be simpler and cleaner than the above approach. :-)

Thanks,
Jie



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Bitmap index status

2006-09-19 Thread Heikki Linnakangas

Jie Zhang wrote:

Hi Heikki and all,

Please find the latest bitmap index patch in the attachment. This patch is
generated against the postgresql cvs head.
  


Thanks.

The handling of stream and hash bitmaps looks pretty complicated to me. 
All the bitmap-related nodes have logic to handle both types slightly 
differently. It all seems to come down to that if a subnode (or 
amgetbitmap in a bitmap index scan node) returns a StreamBitmap, the 
caller needs to call the subnode many times, until it returns a NULL. 
With a HashBitmap, the caller only calls the subnode once.


I think amgetbitmap should be called just once per index scan, and it 
should return either a hash bitmap or a stream bitmap. The same applies 
to all the executor nodes that return bitmaps, they would only return a 
single HashBitmap or StreamBitmap, and the upper node would call 
tbm_iterate repeatedly on that.


StreamBitmap would contain a callback (filled by the indexam) that 
tbm_iterate would call to fill the next TBMIterateResult.


I would also move the code to AND and OR together different types of 
bitmaps to tidbitmap.c, so that BitmapAnd and BitmapOr nodes wouldn't 
need to care about different types either.


tbm_intersect and tbm_union with two HashBitmaps would work like they do 
now. Calling tbm_intersect or tbm_union with two StreamBitmaps would 
return a new StreamBitmap object that would pull results from the two 
argument StreamBitmaps page at a time and AND or OR them together. 
Combining a HashBitmap and a StreamBitmap would wrap the HashBitmap with 
a new StreamBitmap that pulls the entries from the HashBitmap page at a 
time.


What do you think? Would you like me to help implementing the above?


This patch includes code style changes, bug fixes, and the stream bitmap
implementation. I have also fixed the problems mentioned in Heikki's email.
  


It seems you fixed the race condition in inserttuple by holding a write 
lock on the metapage while you find/create the lov item. While correct, 
isn't that pretty bad for concurrency? I realize that the main use case 
for bitmap indexes is large data warehouses where you don't do 
concurrent updates, but still...


--
 Heikki Linnakangas
 EnterpriseDB   http://www.enterprisedb.com


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Bitmap index status

2006-09-19 Thread Gavin Sherry
On Tue, 19 Sep 2006, Heikki Linnakangas wrote:

 Jie Zhang wrote:
  Hi Heikki and all,
 
  Please find the latest bitmap index patch in the attachment. This patch is
  generated against the postgresql cvs head.
 

 Thanks.

 The handling of stream and hash bitmaps looks pretty complicated to me.
 All the bitmap-related nodes have logic to handle both types slightly
 differently. It all seems to come down to that if a subnode (or
 amgetbitmap in a bitmap index scan node) returns a StreamBitmap, the
 caller needs to call the subnode many times, until it returns a NULL.
 With a HashBitmap, the caller only calls the subnode once.

 I think amgetbitmap should be called just once per index scan, and it
 should return either a hash bitmap or a stream bitmap. The same applies
 to all the executor nodes that return bitmaps, they would only return a
 single HashBitmap or StreamBitmap, and the upper node would call
 tbm_iterate repeatedly on that.

 StreamBitmap would contain a callback (filled by the indexam) that
 tbm_iterate would call to fill the next TBMIterateResult.

Right, this was the approach taken by an earlier version of the patch I
had worked on. It was significantly uglified by the need to keep the index
state around and to be careful about what amrescan might do behind out
back. I will, however, introduce the idea again because it makes the code
much cleaner and more logical, as you seem to suggest.

Thanks,

Gavin

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Bitmap index status

2006-09-17 Thread Jie Zhang
Hi Heikki and all,

I just sent the latest bitmap index patch to the list. I am not sure if
there is any size limit for this mailing list. If you have received my
previous email, please let me know.

Thanks,
Jie

On 9/12/06 2:43 AM, Heikki Linnakangas [EMAIL PROTECTED] wrote:

 Hi,
 
 What's the status of the bitmap index patch? Have you worked on it since
 the last posted patch
 (http://archives.postgresql.org/pgsql-patches/2006-08/msg3.php)?
 
 I've started to review it, to get it into CVS early in the 8.3 cycle. I
 just want to make sure that I'm working on the latest version.
 
 Beside the issues already discussed, I found two minor bugs:
 * pg_am says that bitmap am supports unique indexes, while it doesn't.
 Second,
 * race condition in _bitmap_inserttuple if two backends try to insert
 the same, new value. If they both find that there's no lov item for the
 key, and try to create one, one backend will get a duplicate key error
 on the lov index.
 
 Also, vacuum actually does a reindex, which seems awfully wasteful. That
 needs to be looked at.



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Bitmap index status

2006-09-12 Thread Heikki Linnakangas

Hi,

What's the status of the bitmap index patch? Have you worked on it since 
the last posted patch 
(http://archives.postgresql.org/pgsql-patches/2006-08/msg3.php)?


I've started to review it, to get it into CVS early in the 8.3 cycle. I 
just want to make sure that I'm working on the latest version.


Beside the issues already discussed, I found two minor bugs:
* pg_am says that bitmap am supports unique indexes, while it doesn't. 
Second,
* race condition in _bitmap_inserttuple if two backends try to insert 
the same, new value. If they both find that there's no lov item for the 
key, and try to create one, one backend will get a duplicate key error 
on the lov index.


Also, vacuum actually does a reindex, which seems awfully wasteful. That 
needs to be looked at.


--
 Heikki Linnakangas
 EnterpriseDB   http://www.enterprisedb.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Bitmap index status

2006-09-12 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 What's the status of the bitmap index patch? Have you worked on it since 
 the last posted patch 
 (http://archives.postgresql.org/pgsql-patches/2006-08/msg3.php)?

Gavin and Jie have made major changes since that version (or at least
they'd better have something to show for the month since then ;-)).
I wouldn't recommend reviewing the patch until they post something
current ...

 Also, vacuum actually does a reindex, which seems awfully wasteful. That 
 needs to be looked at.

Yikes.  I imagine they've not tried to do anything about that; if you
want to help, maybe you could take that subproblem?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Bitmap index status

2006-09-12 Thread Jie Zhang
Hi Heikki,

Gavin and I are trying to merge our changes together this week. We will post
a new patch by the end of this week. This patch will include some style
fixes, bug fixes, and the stream bitmap implementation.

I will look into the problems you have mentioned in this email. Yes, vacuum
currently does a reindex now. Gavin and I just talked about this yesterday.
We are looking into ways to improve this. One way is not to do reindex for
each vacuum. We maintain a list of updated tids along with the bitmap index.
Only when this list goes to a certain point, vacuum will re-build the index.

Thanks,
Jie

On 9/12/06 2:43 AM, Heikki Linnakangas [EMAIL PROTECTED] wrote:

 Hi,
 
 What's the status of the bitmap index patch? Have you worked on it since
 the last posted patch
 (http://archives.postgresql.org/pgsql-patches/2006-08/msg3.php)?
 
 I've started to review it, to get it into CVS early in the 8.3 cycle. I
 just want to make sure that I'm working on the latest version.
 
 Beside the issues already discussed, I found two minor bugs:
 * pg_am says that bitmap am supports unique indexes, while it doesn't.
 Second,
 * race condition in _bitmap_inserttuple if two backends try to insert
 the same, new value. If they both find that there's no lov item for the
 key, and try to create one, one backend will get a duplicate key error
 on the lov index.
 
 Also, vacuum actually does a reindex, which seems awfully wasteful. That
 needs to be looked at.



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq