leerho commented on code in PR #164:
URL: https://github.com/apache/datasketches-website/pull/164#discussion_r1511613587
##########
docs/Theta/ThetaUpdateSpeed.md:
##########
@@ -22,23 +22,23 @@ layout: doc_page
 ## Theta Family Update Speed
 ### Resize Factor = X1
-The following graph illustrates the update speed of 3 different sketches from the library: the Heap QuickSelect Sketch, the Off-Heap QuickSelect Sketch, and the Heap Alpha Sketch.
-The X-axis is the number of unique values presented to a sketch. The Y-axis is the average time to perform an update. It is computed as the total time to update X-uniques divided by X-uniques.
+The following graph illustrates the update speed of 3 different sketches from the library: the Heap QuickSelect (QS) Sketch, the Off-Heap QuickSelect (QS) Sketch, and the Heap Alpha Sketch.
+The X-axis is the number of unique values presented to a sketch. The Y-axis is the average time to perform an update. It is computed as the total time to update X-uniques, divided by X-uniques.
-The high values on the left are due to Java overhead and JVM warmup. The humps in the middle of the graph are due to the internal hash table filling up and forcing an internal rebuild and reducing theta. For this plot the sketches were configured with <i>k</i> = 4096.
-The sawtooth peaks on the QS plots represent successive reqbuilds. The downward slope on the right side of the hump is the sketch speeding up because it is rejecting more and more incoming hash values due to the continued reduction in the value of theta.
-The Alpha sketch (in red) uses a more advanced hash table update algorithm that defers the first rebuild until after theta has started decreasing. This is the little spike just to the right of the hump.
+The high values on the left are due to Java overhead and JVM warmup. The spikes starting at about 4K uniques are due to the internal hash-table filling up and forcing an internal hash-table rebuild, which also reduces theta. For this plot the sketches were configured with <i>k</i> = 4096.
+The sawtooth peaks on the QuickSelect curves represent successive rebuilds. The downward slope on the right side of the largest spike is the sketch speeding up because it is rejecting more and more incoming hash values due to the continued reduction in the value of theta.
+The Alpha sketch (in red) uses a more advanced hash-table update algorithm that defers the first rebuild until after theta has started decreasing. This is the little spike just to the right of the maximum of the curve.

Review Comment:
fixed.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
