On 7/24/06 6:04 PM, "Bruce Momjian" <[EMAIL PROTECTED]> wrote:
> Jie Zhang wrote: >>> IIRC they quoted the cardinality of 10000 as something that is still >>> faster than btree for several usecases. >>> >>> And also for AND-s of several indexes, where indexes are BIG, your btree >>> indexes may be almost as big as tables but the resulting set of pages is >>> small. >> >> Yeah, Hannu points it out very well -- the bitmap index works very well when >> columns have low cardinalities and AND operations will produce small number >> of results. > > What operations on columns of low cardinality produce a small number of > results? That seems contradictory. Let's see an example. Table 'T' includes two columns, 'p' and 's'. The column 'p' has 100 distinct values, say p1-p100 and the column 'status' has 20 distinct values, say s1-s20. The query 'select * from order where priority=p1 and status=s1' may produce small number of results. Also, if these related rows are clustered together, that would be even better. > >> Also, the bitmap index is very small in low cardinality cases, where the >> btree tends to take up at least 10 times more space. > > Also, are adding/changing rows is more expensive with bitmaps than > btrees? Inserting a row will only affect the last word (at most last several words) of a bitmap vector, so this should not be very expensive: 3-4 IOs. When a row is updated and the new row is inserted in the middle of the heap, currently the code will update the bit in the place -- where the bit should be. Searching for the page which includes the bit to be updated is not very efficient now, but this can be fixed. Currently, we have to scan the pages for a bitmap vector one by one until we hit the right page. Since the bitmap vector is compressed, updating a bit in the middle may cause its page to overflow. In this case, we create a new page to accommodate those extra bits, and insert this new page right after the original page. Overall, inserting a row or updating a row can be done efficiently. But it is true that the bitmap index does not perform well if there are lots of inserts and updates, especially updates. ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match