[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2022-04-22 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-15510:
---
  Fix Version/s: 4.0.5
 4.1
 (was: 4.x)
 (was: 4.0.x)
Source Control Link: 
https://github.com/apache/cassandra/commit/018c8e0d5e8bc55fc51d3361fcb27c3c1fd189f6
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Committed into cassandra-4.0 at 018c8e0d5e8bc55fc51d3361fcb27c3c1fd189f6 and 
merged into trunk

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.5, 4.1
>
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  These leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but remains uniformly lower.
> h2. Transforming
> Transformations garbage overhead is minimal, i.e. the main allocations are 
> those necessary to represent the new 

[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2022-04-21 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15510:
---
Status: Ready to Commit  (was: Review In Progress)

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  These leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but remains uniformly lower.
> h2. Transforming
> Transformations garbage overhead is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the builder and transformer objects to define an efficient 
> builder that reuses as much of the original tree as possible. 
> We also introduce dedicated {{transform}} methods (that forbid returning 
> {{null}}), and {{BiFunction}} 

[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2022-03-07 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-15510:

Reviewers: Benjamin Lerer, Branimir Lambov  (was: Benjamin Lerer)
   Status: Review In Progress  (was: Patch Available)

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  These leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but remains uniformly lower.
> h2. Transforming
> Transformations garbage overhead is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the builder and transformer objects to define an efficient 
> builder that reuses as much of the original tree as possible. 
> We also introduce dedicated {{transform}} methods (that forbid returning 
> {{null}}), and 

[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2022-02-19 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-15510:
---
Status: Patch Available  (was: In Progress)

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  These leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but remains uniformly lower.
> h2. Transforming
> Transformations garbage overhead is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the builder and transformer objects to define an efficient 
> builder that reuses as much of the original tree as possible. 
> We also introduce dedicated {{transform}} methods (that forbid returning 
> {{null}}), and {{BiFunction}} transformations to permit efficient follow-ups.



--
This message was sent 

[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2022-01-24 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-15510:
---
Status: In Progress  (was: Patch Available)

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  These leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but remains uniformly lower.
> h2. Transforming
> Transformations garbage overhead is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the builder and transformer objects to define an efficient 
> builder that reuses as much of the original tree as possible. 
> We also introduce dedicated {{transform}} methods (that forbid returning 
> {{null}}), and {{BiFunction}} transformations to permit efficient follow-ups.



--
This message was sent 

[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2022-01-24 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-15510:
---
Status: Patch Available  (was: Review In Progress)

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  These leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but remains uniformly lower.
> h2. Transforming
> Transformations garbage overhead is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the builder and transformer objects to define an efficient 
> builder that reuses as much of the original tree as possible. 
> We also introduce dedicated {{transform}} methods (that forbid returning 
> {{null}}), and {{BiFunction}} transformations to permit efficient follow-ups.



--
This message 

[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2021-12-06 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-15510:
---
Status: Review In Progress  (was: Patch Available)

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  These leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but remains uniformly lower.
> h2. Transforming
> Transformations garbage overhead is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the builder and transformer objects to define an efficient 
> builder that reuses as much of the original tree as possible. 
> We also introduce dedicated {{transform}} methods (that forbid returning 
> {{null}}), and {{BiFunction}} transformations to permit efficient follow-ups.



--
This message 

[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2021-12-03 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-15510:
---
Reviewers: Benjamin Lerer

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  These leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but remains uniformly lower.
> h2. Transforming
> Transformations garbage overhead is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the builder and transformer objects to define an efficient 
> builder that reuses as much of the original tree as possible. 
> We also introduce dedicated {{transform}} methods (that forbid returning 
> {{null}}), and {{BiFunction}} transformations to permit efficient follow-ups.



--
This message was sent by Atlassian Jira

[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2021-12-01 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-15510:
---
Fix Version/s: 4.0.x

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  These leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but remains uniformly lower.
> h2. Transforming
> Transformations garbage overhead is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the builder and transformer objects to define an efficient 
> builder that reuses as much of the original tree as possible. 
> We also introduce dedicated {{transform}} methods (that forbid returning 
> {{null}}), and {{BiFunction}} transformations to permit efficient follow-ups.



--
This message was sent by Atlassian 

[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2020-01-20 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15510:
---
Fix Version/s: 4.x

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.x
>
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  These leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but remains uniformly lower.
> h2. Transforming
> Transformations garbage overhead is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the builder and transformer objects to define an efficient 
> builder that reuses as much of the original tree as possible. 
> We also introduce dedicated {{transform}} methods (that forbid returning 
> {{null}}), and {{BiFunction}} transformations to permit efficient follow-ups.



--
This message was sent by Atlassian 

[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2020-01-16 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15510:
---
Test and Documentation Plan: Improvements to LongBTreeTest included, some 
more to come.  Several JMH benchmarks included.
 Status: Patch Available  (was: Open)

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  These leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but remains uniformly lower.
> h2. Transforming
> Transformations garbage overhead is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the builder and transformer objects to define an efficient 
> builder that reuses as much of the original tree as possible. 
> We also introduce dedicated {{transform}} methods (that 

[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2020-01-16 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15510:
---
Change Category: Performance
 Complexity: Challenging
 Status: Open  (was: Triage Needed)

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  These leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but remains uniformly lower.
> h2. Transforming
> Transformations garbage overhead is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the builder and transformer objects to define an efficient 
> builder that reuses as much of the original tree as possible. 
> We also introduce dedicated {{transform}} methods (that forbid returning 
> {{null}}), and {{BiFunction}} transformations to permit 

[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2020-01-16 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15510:
---
Component/s: Local/Other

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  These leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but remains uniformly lower.
> h2. Transforming
> Transformations garbage overhead is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the builder and transformer objects to define an efficient 
> builder that reuses as much of the original tree as possible. 
> We also introduce dedicated {{transform}} methods (that forbid returning 
> {{null}}), and {{BiFunction}} transformations to permit efficient follow-ups.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)