Hey Tim, > Many years ago at a 4D Summit I remember...
Do you have a photographic memory? <smile> Good info. Thanks, John... > Many years ago at a 4D Summit I remember someone asking Laurent a question of > how Cluster B-Tree indexes were implemented. He gave a fairly detailed > answer. > > Cluster B-Tree indexes are a B-Tree index, but there is only 1 node for each > index value. The data in the node can be several different things. > > - It could be a reference to a single record like you would have in a > traditional B-Tree index. It’s just a record number. > > - If the node is for 2 or more records, an array of record numbers is stored > in the node. He made a point of saying it is the same ARRAY LONGINT just like > we have in the language, so 4 bytes per record. > > - At some point — he didn’t go into when it would happen — 4D switches from > an array of record numbers to a set of records. He said it’s the same type of > 4D set that we have in the language where you have 1 bit representing each > record that is included in the set, and all the size optimizations they do > for sets using compression and whatever. > > The idea is to try and balance speed of converting that list of records — > whether it is an array of record numbers or a set — and the amount of storage > needed for the node. > > So when you do a typical query using a Cluster B-Tree you get a single node. > If the node is for 1 record is is just as fast to retrieve the record as with > a normal B-Tree index. If the node contains an array of 10 records numbers, > then the array is used to retrieve the 10 records and create a selection. > > But if the node contains a lot of record — who know exactly how many “a lot” > is — then a set is used to create the selection of records. And that is very > fast and efficient. That’s why a Cluster B-Tree index on a boolean field is > always super fast for 10,000 records or 1,000,000 records. Using sets makes > it very fast to build that selection as compared to using a normal B-Tree > index where you would have to walk 10,000 nodes or 1,000,000 nodes of the > tree to build of the selection. > > Also Cluster B-Tree indexes reduces the need to rebalance the tree as new > records are added to the table since it usually just adds data to an existing > node rather than always adding a new node for each new record. ********************************************************************** 4D Internet Users Group (4D iNUG) Archive: http://lists.4d.com/archives.html Options: https://lists.4d.com/mailman/options/4d_tech Unsub: mailto:4d_tech-unsubscr...@lists.4d.com **********************************************************************