[Hadoop Wiki] Update of "Hive/IndexDev/Bitmap" by MarquisWang

Apache Wiki Fri, 19 Nov 2010 18:44:41 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.


The "Hive/IndexDev/Bitmap" page has been changed by MarquisWang.
http://wiki.apache.org/hadoop/Hive/IndexDev/Bitmap?action=diff&rev1=1&rev2=2

--------------------------------------------------

  
  The basic implementation's only compression is to drop blocks in which every 
row is 0. Larger blocks are unlikely to be entirely zero, so we need a better 
compression format. One option is byte-aligned bitmap compression: the bitmap 
is stored as an array of bytes, and a byte of all 0s or all 1s stands for a run 
of one or more bytes whose bits are all 0 or all 1, respectively. We would then 
need to add another column to the bitmap index table, an array of Ints giving 
the length of each run, along with logic to expand the compressed form.
  
+ == Example ==
+ 
+ Suppose we have a bitmap index on a key where, in the first block, value "a" 
appears in rows 5, 12, and 64, and value "b" appears in rows 7, 8, and 9. Then, 
for the preliminary implementation, the first entry in the index table will be:
+ 
+ 
{{https://issues.apache.org/jira/secure/attachment/12460083/bitmap_index_1.png}}
+ 
+ The values in the array represent the bitmap for each block, where each 
32-bit BigInt value stores 32 rows.
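
As a rough illustration of the layout described above, the sketch below packs row numbers into 32-bit words. It is not taken from the Hive implementation: the function name `rows_to_words`, the 64-row block size, 1-based row numbering, and the least-significant-bit-first order are all assumptions made for the example, and the attached image may use a different bit order.

```python
from typing import Iterable, List

BITS_PER_WORD = 32  # each array entry covers 32 rows, as described above


def rows_to_words(rows: Iterable[int], block_rows: int) -> List[int]:
    """Pack 1-based row numbers into 32-bit words (hypothetical layout)."""
    n_words = (block_rows + BITS_PER_WORD - 1) // BITS_PER_WORD
    words = [0] * n_words
    for row in rows:
        if not 1 <= row <= block_rows:
            raise ValueError(f"row {row} is outside the block")
        # Assumption: row 1 maps to the lowest bit of the first word.
        index, bit = divmod(row - 1, BITS_PER_WORD)
        words[index] |= 1 << bit
    return words


# Values "a" and "b" from the example, assuming a 64-row block.
a_words = rows_to_words([5, 12, 64], block_rows=64)
b_words = rows_to_words([7, 8, 9], block_rows=64)
print([hex(w) for w in a_words])  # ['0x810', '0x80000000']
print([hex(w) for w in b_words])  # ['0x1c0', '0x0']
```

Under these assumptions, each value needs two array entries per 64-row block, whether or not most of the bits are zero.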
+ 
+ For the second iteration, the first entry will be:
+ 
+ 
{{https://issues.apache.org/jira/secure/attachment/12460083/bitmap_index_2.png}}
+ 
+ This one uses 1-byte array entries, so each value in the array stores 8 rows. 
If an entry is 0x00 or 0xFF, it represents one or more consecutive bytes of 
all zeros or all ones, respectively (in this case, run lengths of 5 and 4).
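
The byte-aligned scheme can be sketched as a pair of functions that collapse and re-expand runs of 0x00 and 0xFF bytes. This is only a minimal sketch under stated assumptions, not the Hive code: the names `compress`, `expand`, and `rows_to_bytes` are hypothetical, run lengths are kept in a separate list (standing in for the extra Int-array column), rows are 1-based, bits are stored least-significant first, and a block is assumed to hold 64 rows.

```python
from typing import Iterable, List, Tuple

ZERO_FILL = 0x00  # marker for a run of all-zero bytes
ONE_FILL = 0xFF   # marker for a run of all-one bytes


def rows_to_bytes(rows: Iterable[int], block_rows: int) -> bytes:
    """Build the uncompressed bitmap, 8 rows per byte (assumed bit order)."""
    raw = bytearray((block_rows + 7) // 8)
    for row in rows:
        index, bit = divmod(row - 1, 8)
        raw[index] |= 1 << bit
    return bytes(raw)


def compress(raw: bytes) -> Tuple[List[int], List[int]]:
    """Replace each run of 0x00 or 0xFF bytes with one marker byte.

    Returns (entries, run_lengths); run_lengths has one item per marker.
    """
    entries: List[int] = []
    run_lengths: List[int] = []
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte in (ZERO_FILL, ONE_FILL):
            j = i
            while j < len(raw) and raw[j] == byte:
                j += 1
            entries.append(byte)
            run_lengths.append(j - i)
            i = j
        else:
            entries.append(byte)
            i += 1
    return entries, run_lengths


def expand(entries: List[int], run_lengths: List[int]) -> bytes:
    """Inverse of compress: re-inflate each marker to its run length."""
    out = bytearray()
    lengths = iter(run_lengths)
    for byte in entries:
        if byte in (ZERO_FILL, ONE_FILL):
            out.extend([byte] * next(lengths))
        else:
            out.append(byte)
    return bytes(out)


raw_a = rows_to_bytes([5, 12, 64], block_rows=64)
entries_a, runs_a = compress(raw_a)
print([hex(b) for b in entries_a], runs_a)
# ['0x10', '0x8', '0x0', '0x80'] [5]
```

With these assumptions, value "a" compresses from eight bytes to four entries plus a single run length of 5, and `expand(entries_a, runs_a)` returns the original eight bytes.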
+
