Following up on our conversations: it seems a good way to do that is to optimize the current index in the trailer, which is currently inserted into a map when the cellstore is opened. Searching the index directly would obviate the need for a map, which would make lookups faster and use less memory. That way people can trade off index overhead by changing the block size of a cellstore.
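
A minimal sketch of the idea, assuming each trailer index entry holds the last key of a block plus that block's file offset, and that entries are already in key order. BlockIndexEntry and find_block are made-up names for illustration, not the actual CellStore code:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical index entry: the last key stored in a block and the
// block's offset in the file. The real trailer layout may differ.
struct BlockIndexEntry {
  std::string last_key;
  std::uint64_t offset;
};

// Returns the position of the first block whose last key is >= key,
// i.e. the only block that could contain key. Returns index.size()
// if key sorts after every block.
inline std::size_t find_block(const std::vector<BlockIndexEntry> &index,
                              const std::string &key) {
  // Precondition (assumed): entries are sorted by last_key.
  assert(std::is_sorted(index.begin(), index.end(),
                        [](const BlockIndexEntry &a, const BlockIndexEntry &b) {
                          return a.last_key < b.last_key;
                        }));
  // Binary search over the flat array; no map is built.
  auto it = std::lower_bound(
      index.begin(), index.end(), key,
      [](const BlockIndexEntry &e, const std::string &k) {
        return e.last_key < k;
      });
  return static_cast<std::size_t>(it - index.begin());
}
```

The lookup is still O(log n) in the number of blocks, but the entries sit in one contiguous array instead of one map node each, which is where the memory saving would come from.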
__Luke

On Mar 4, 11:19 am, Sanjit Jhala <[email protected]> wrote:
> Hi,
>
> In the current CellStore, searching for a particular key is a somewhat
> expensive operation.
> After the scanner uses an index to locate the block the key could be
> in, a linear scan of the block needs to be performed to locate the key.
>
> If the block were to also store the number of keys and offsets to the
> positions of the keys, one could locate the keys much faster
> exploiting the fact that they are stored in sorted order. The
> additional storage cost could be reduced by storing delta offsets and
> compressing them or storing offsets for a fraction of the keys.
>
> -Sanjit
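
For reference, the per-block offset table described in the quoted message could look roughly like the sketch below. It assumes a decompressed block with keys packed back to back in sorted order and one offset per key (values, delta encoding and compression are left out). Block, append_key, compare_at and find_key are hypothetical names, not the existing CellStore format:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical block layout: key bytes stored contiguously, plus a table
// of offsets. offsets[i] is the start of key i; the final entry is the
// end of the data, so the key count is offsets.size() - 1.
struct Block {
  std::string data;
  std::vector<std::uint32_t> offsets;
};

// Three-way comparison of stored key i against key (<0, 0, >0).
inline int compare_at(const Block &b, std::size_t i, const std::string &key) {
  return b.data.compare(b.offsets[i], b.offsets[i + 1] - b.offsets[i], key);
}

// Appends a key; keys must arrive in strictly increasing order.
inline void append_key(Block &b, const std::string &key) {
  if (b.offsets.empty()) b.offsets.push_back(0);
  std::size_t n = b.offsets.size() - 1;  // keys stored so far
  assert(n == 0 || compare_at(b, n - 1, key) < 0);
  b.data += key;
  b.offsets.push_back(static_cast<std::uint32_t>(b.data.size()));
}

// Binary search using the offset table instead of a linear scan.
// Returns the index of key within the block, or -1 if it is absent.
inline long find_key(const Block &b, const std::string &key) {
  std::size_t n = b.offsets.empty() ? 0 : b.offsets.size() - 1;
  std::size_t lo = 0, hi = n;
  while (lo < hi) {
    std::size_t mid = lo + (hi - lo) / 2;
    if (compare_at(b, mid, key) < 0) lo = mid + 1;
    else hi = mid;
  }
  return (lo < n && compare_at(b, lo, key) == 0) ? static_cast<long>(lo) : -1;
}
```

Storing offsets for only a fraction of the keys, as suggested in the message, would shrink the table and turn the last step into a short linear scan between two sampled offsets.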
