Thanks ABdullah for clarifying this. I read about LSM filters in Sattam's paper, now looking at the code.
Thanks and regards Veeral Shah On Mon, Mar 7, 2016 at 7:58 PM, abdullah alamoudi <[email protected]> wrote: > Hi Veeral, > Two points: > 1. Keys in our LSM index are not mutually exclusive between different > components. > 2. We have the concept of filtered LSM index in which we store (min-max) of > a field(not necessarily the key) and we can use them then when answering > queries to know which components to look at. > > Hope this helps, > Abdullah. > > On Mon, Mar 7, 2016 at 4:48 PM, Veeral Shah <[email protected]> wrote: > > > Had a small question around asterixdb LSMified secondry index > structure > > (Disclaimer : I dont have complete familiarity with the LSM file formats > in > > asterixdb - hence below also sort of validates my understanding), > > > > In a classic LSM implementation, files (aka SST files in leveldb) at > level > > L1 onwards (L1, L2, L3,...), are sorted based on keys (assuming key types > > where linear order can be imposed) and the key ranges of any two SST > files, > > in a level, are mutually exclusive. Since the LSM files are immutable, a > > file's key range at a level remains fixed (unless compaction happens). So > > taking a case of a asterixdb LSMified secondary index, assuming the keys > > are byte strings, all keys should be alphabetically sorted. > > > > If above is true, there may be a merit in creating an additional > in-memory > > "coarse-index" (for each level) which keeps info about key.max and > key.min > > for each file region (say an SST file is broken into multiple smaller > > regions of fixed size). This would help in quickly figuring out where a > key > > could fall, in a given LSM level. (The coarse-index is to be > > created/recreated during startup and compaction) > > > > Thanks and regards > > Veeral Shah > > >
