Josh Berkus <josh@agliodbs.com> writes:
> One particular compelling situation for on-disk bitmaps is for terabyte
> tables where a btree index would not fit into memory.  Index
> performance for an index which is 10x or more the size of RAM really
> sucks ... I can come up with some test results if you doubt that.
Sure...

> Also note that "low cardinality" is relative.  For a 1 billion row
> table, a column with 10,000 values is "low-cardinality", having around
> 100,000 rows per value ... but that's still 0.01% of the table per
> value, making index use still applicable.

And your point is?  Assuming a reasonable datatype like int4, a btree
index on this table would occupy say 20GB (16 bytes per entry plus
fillfactor overhead).  The bitmap version would require 10,000 bitmaps
of 1G bits apiece, or 1250GB (not even counting overhead).  You need
some wildly optimistic assumptions about the compressibility of the
bitmaps to get even within hailing distance of the btree, let alone
fit in RAM.  A realistic assumption for the numbers you mention is that
the bitmaps have 1-bits about 100 bits apart, which means you could get
maybe 3-to-1 compression using the runlength scheme that's in there ...
leaving the bitmap a factor of 20 bigger than the btree.

AFAICS "low cardinality" has to mean just that, a few dozen distinct
values at most, for this scheme to have any hope.

			regards, tom lane
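As a postscript, the size arithmetic above can be checked with a short
sketch. The row count, distinct-value count, and 16-bytes-per-entry
figure come from the thread; the 25% fillfactor padding and the 3:1
run-length ratio are the rough assumptions named in the text, not
measured values.

```python
# Back-of-the-envelope check of the index sizes discussed above.
# Assumed inputs: 1 billion rows, 10,000 distinct values, ~16 bytes
# per btree entry, ~25% fillfactor padding, ~3:1 runlength compression.

ROWS = 1_000_000_000
DISTINCT = 10_000
BYTES_PER_BTREE_ENTRY = 16      # int4 key + item pointer + overhead
FILLFACTOR_PADDING = 1.25       # rough allowance for non-full pages

btree_bytes = ROWS * BYTES_PER_BTREE_ENTRY * FILLFACTOR_PADDING

# One bitmap per distinct value, one bit per row, uncompressed.
bitmap_bytes = DISTINCT * ROWS / 8

# With 1-bits ~100 bits apart, assume ~3:1 runlength compression.
compressed_bitmap_bytes = bitmap_bytes / 3

GB = 1e9  # decimal GB, matching the figures quoted in the thread
print(f"btree:              {btree_bytes / GB:7.0f} GB")
print(f"bitmaps (raw):      {bitmap_bytes / GB:7.0f} GB")
print(f"bitmaps (3:1 RLE):  {compressed_bitmap_bytes / GB:7.0f} GB")
print(f"compressed bitmaps / btree: "
      f"{compressed_bitmap_bytes / btree_bytes:.1f}x")
```

Under those assumptions the script reproduces the quoted figures:
about 20GB for the btree, 1250GB for the raw bitmaps, and roughly a
factor of 20 between the compressed bitmaps and the btree.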