Re: Longitude Index: B-Tree or Cluster

John J Foster via 4D_Tech Wed, 15 Apr 2020 07:07:30 -0700

Hey Tim,

> Many years ago at a 4D Summit I remember...


Do you have a photographic memory? <smile>

Good info.

Thanks,
John...

> Many years ago at a 4D Summit I remember someone asking Laurent a question of 
> how Cluster B-Tree indexes were implemented. He gave a fairly detailed 
> answer. 
> 
> Cluster B-Tree indexes are a B-Tree index, but there is only 1 node for each 
> index value. The data in the node can be several different things. 
> 
> - It could be a reference to a single record like you would have in a 
> traditional B-Tree index. It’s just a record number. 
> 
> - If the node is for 2 or more records, an array of record numbers is stored 
> in the node. He made a point of saying it is the same ARRAY LONGINT just like 
> we have in the language, so 4 bytes per record.
> 
> - At some point — he didn’t go into when it would happen — 4D switches from 
> an array of record numbers to a set of records. He said it’s the same type of 
> 4D set that we have in the language where you have 1 bit representing each 
> record that is included in the set, and all the size optimizations they do 
> for sets using compression and whatever. 
> 
> The idea is to try and balance speed of converting that list of records — 
> whether it is an array of record numbers or a set — and the amount of storage 
> needed for the node.
> 
> So when you do a typical query using a Cluster B-Tree you get a single node. 
> If the node is for 1 record is is just as fast to retrieve the record as with 
> a normal B-Tree index. If the node contains an array of 10 records numbers, 
> then the array is used to retrieve the 10 records and create a selection. 
> 
> But if the node contains a lot of record — who know exactly how many “a lot” 
> is — then a set is used to create the selection of records. And that is very 
> fast and efficient. That’s why a Cluster B-Tree index on a boolean field is 
> always super fast for 10,000 records or 1,000,000 records. Using sets makes 
> it very fast to build that selection as compared to using a normal B-Tree 
> index where you would have to walk 10,000 nodes or 1,000,000 nodes of the 
> tree to build of the selection. 
> 
> Also Cluster B-Tree indexes reduces the need to rebalance the tree as new 
> records are added to the table since it usually just adds data to an existing 
> node rather than always adding a new node for each new record.

**********************************************************************
4D Internet Users Group (4D iNUG)
Archive:  http://lists.4d.com/archives.html
Options: https://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**********************************************************************

Re: Longitude Index: B-Tree or Cluster

Reply via email to