leerho commented on PR #62: URL: https://github.com/apache/datasketches-rust/pull/62#issuecomment-3770275536
There is some documentation, but it is still sparse. From the [datasketches website](https://datasketches.apache.org), you can find under Code Docs [CPP 5.2.0](https://apache.github.io/datasketches-cpp/5.2.0/) then [Namespaces/DensitySketch](https://apache.github.io/datasketches-cpp/5.2.0/classdatasketches_1_1density__sketch.html). Then scroll down to the Detailed Description.

I have not used this sketch, but in many scientific areas it is important to understand the density at a point in a large algebraic field. This sketch is constructed with four parameters: a value _k_, which controls the overall size and accuracy of the sketch; _dim_, the number of spatial dimensions; a _Kernel_ function, which does the work of identifying the density at a point in _dim-space_ within a _dim-sphere_ whose radial sensitivity is controlled by the _Kernel_ density function (a common one is Gaussian, but the user can supply others); and finally an _Allocator_.

The input is a stream of floating-point vectors of size _dim_. A query is basically a _get_estimate(point)_, where the _point_ is a vector of size _dim_. Once the sketch reaches size _k_, it starts summarizing the field into a _core-set_ of points that approximates the densest points in the field.

I would study some of the test cases for some simple examples and read the [Karnin & Liberty](https://proceedings.mlr.press/v99/karnin19a/karnin19a.pdf) paper.
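To make the update/query semantics described above concrete, here is a minimal Python sketch of the exact quantity that a density sketch approximates. It is not the datasketches API: the class name `ExactDensity` is hypothetical, it stores every input point instead of compacting to a core-set of size _k_, and it assumes a Gaussian kernel of the form exp(-||x - y||^2) with the estimate normalized by the number of points seen.

```python
import math


def gaussian_kernel(x, y):
    """Gaussian kernel exp(-||x - y||^2); the unit bandwidth is an assumption."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)))


class ExactDensity:
    """Hypothetical, illustration-only reference: keeps ALL points (no core-set)."""

    def __init__(self, dim, kernel=gaussian_kernel):
        self.dim = dim
        self.kernel = kernel
        self.points = []

    def update(self, vector):
        # The input stream consists of floating-point vectors of size dim.
        if len(vector) != self.dim:
            raise ValueError("vector must have exactly dim components")
        self.points.append(tuple(float(v) for v in vector))

    def get_estimate(self, point):
        # Average kernel value between the query point and every stored point.
        if len(point) != self.dim:
            raise ValueError("point must have exactly dim components")
        if not self.points:
            raise ValueError("no points have been added yet")
        total = sum(self.kernel(p, point) for p in self.points)
        return total / len(self.points)


density = ExactDensity(dim=2)
for v in [(0.0, 0.0), (0.0, 0.0), (3.0, 4.0)]:
    density.update(v)

near = density.get_estimate((0.0, 0.0))    # query on top of two of the points
far = density.get_estimate((10.0, 10.0))   # query far from every point
print(near, far)
```

A real density sketch would answer the same kind of query from a weighted core-set of bounded size rather than from the full list of points, trading a bounded approximation error for bounded memory.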
