Confusion about "the primary key intervals of different RowSets may intersect"

曾杰南 Sat, 19 Mar 2016 03:43:04 -0700

Hi all:
I have been studying the Kudu paper, "Kudu: Storage for Fast Analytics on
Fast Data", closely to find out why HBase's random-query performance is
superior to Kudu's. The statement that "the primary key intervals of
different RowSets may intersect" may be one of the reasons.
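
To make the concern concrete, here is a minimal Python sketch, not Kudu's
actual code. The RowSet names and key ranges are made up for illustration.
It only shows that when key intervals overlap, a lookup for a single key
may have more than one RowSet to consider.

```python
# Hypothetical RowSets, each described by its (min_key, max_key) interval.
# The names and ranges are illustrative assumptions, not taken from Kudu.
rowsets = {
    "rs_a": (0, 50),
    "rs_b": (30, 80),
    "rs_c": (70, 120),
}

def candidate_rowsets(key):
    """Return the names of all RowSets whose key interval contains `key`."""
    return sorted(
        name for name, (lo, hi) in rowsets.items() if lo <= key <= hi
    )

print(candidate_rowsets(40))   # key 40 lies inside both rs_a and rs_b
print(candidate_rowsets(100))  # key 100 lies inside rs_c only
```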


My confusion is why the DiskRowSets are not kept globally ordered by
primary key. When a MemRowSet is flushed, its rows could be dispatched to
the DeltaMemStores of the corresponding DiskRowSets. The negative side
effect is fragmentation of the DiskRowSets, but that seems worth it to get
a global ordering of the DiskRowSets.
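
The alternative I have in mind could be sketched roughly as follows. This
is only an illustration of the proposal, not how Kudu works: the
`lower_bounds` list and the `route_flush` helper are hypothetical, and the
sketch assumes integer keys that are never smaller than the first bound.

```python
import bisect

# Hypothetical globally ordered layout: DiskRowSet i covers keys in
# [lower_bounds[i], lower_bounds[i + 1]); the last one is open-ended.
lower_bounds = [0, 100, 200]

def route_flush(mem_rowset_keys):
    """Group flushed keys by the index of the DiskRowSet that covers them.

    Assumes every key is >= lower_bounds[0].
    """
    routed = {i: [] for i in range(len(lower_bounds))}
    for key in sorted(mem_rowset_keys):
        # Find the rightmost DiskRowSet whose lower bound is <= key.
        idx = bisect.bisect_right(lower_bounds, key) - 1
        routed[idx].append(key)
    return routed

print(route_flush([5, 150, 120, 250, 99]))
```

In this sketch a single flush spreads its rows over every DiskRowSet whose
range it touches, which is the fragmentation mentioned above.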

best
jie
