Chengxiang Li created FLINK-2533:
------------------------------------
Summary: Gap based random sample optimization
Key: FLINK-2533
URL: https://issues.apache.org/jira/browse/FLINK-2533
Project: Flink
Issue Type: Improvement
Components: Core
Reporter: Chengxiang Li
Priority: Minor
For samplers that take a fraction, such as BernoulliSampler and PoissonSampler,
a gap-based random sampler could use an O(k) sampling implementation instead of
the current O(n) one, where n is the number of input elements and k is the
number of elements sampled. It should perform better when the sample fraction
is very small. [This
blog|http://erikerlandson.github.io/blog/2014/09/11/faster-random-samples-with-gap-sampling/]
describes gap-based random sampling in more detail.
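
To make the idea concrete, here is a minimal sketch of gap sampling for the Bernoulli case. It is written in Python for brevity and is not Flink code. The function name gap_bernoulli_sample, its parameters, and the MAX_GAP cap are assumptions made for illustration. Instead of drawing one random number per input element, it draws the length of the gap to the next accepted element from a geometric distribution, so the number of random draws is proportional to the number of sampled elements.

```python
import math
import random
from itertools import islice

_END = object()      # sentinel marking an exhausted iterator
MAX_GAP = 2 ** 62    # illustrative cap so islice always gets a valid offset


def gap_bernoulli_sample(items, fraction, rng=None):
    """Yield each element of `items` with probability `fraction`.

    Hypothetical helper: one random draw per *accepted* element,
    rather than one draw per input element.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = rng or random.Random()
    it = iter(items)
    if fraction == 0.0:
        return
    if fraction == 1.0:
        yield from it
        return

    # log(1 - fraction), computed accurately for very small fractions.
    log_q = math.log1p(-fraction)
    while True:
        u = 1.0 - rng.random()  # uniform in (0, 1], so log(u) is defined
        # Number of rejected elements before the next accepted one:
        # P(gap >= g) = (1 - fraction) ** g, a geometric distribution.
        gap = int(min(math.log(u) / log_q, float(MAX_GAP)))
        # Skip `gap` elements, then take the next one if it exists.
        item = next(islice(it, gap, None), _END)
        if item is _END:
            return
        yield item


if __name__ == "__main__":
    sample = list(gap_bernoulli_sample(range(1_000_000), 0.001, random.Random(7)))
    print(len(sample), sample[:5])
```

Note that skipping elements of a plain iterator still advances past every element, so the saving in this sketch is in random number generation and per-element acceptance checks. A source that supports cheap skipping would benefit more.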
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)