[ https://issues.apache.org/jira/browse/FLINK-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

GaoLun updated FLINK-2535:
--------------------------
    Attachment: sampling.png

Statistics on the number of rejected items with SRS & SSRS.

> Fixed size sample algorithm optimization
> ----------------------------------------
>
>                 Key: FLINK-2535
>                 URL: https://issues.apache.org/jira/browse/FLINK-2535
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chengxiang Li
>            Priority: Minor
>         Attachments: sampling.png
>
>
> Fixed-size sampling algorithms are known to be less efficient than
> fraction-based sampling algorithms, but they are sometimes necessary. Some
> optimizations could significantly reduce the storage size and computation
> cost, such as the algorithm described in [this
> paper|http://machinelearning.wustl.edu/mlpapers/papers/icml2013_meng13a].
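
For readers unfamiliar with fixed-size sampling, here is a minimal sketch of one common baseline: give every item a uniform random key and keep the k items with the smallest keys in a bounded heap. This is not Flink's implementation and not the optimized algorithm from the linked paper. The function name `fixed_size_sample`, its parameters, and the `rejected` counter are hypothetical and exist only for illustration.

```python
import heapq
import random


def fixed_size_sample(items, k, seed=None):
    """Draw k items uniformly without replacement from an iterable.

    Returns (sample, rejected), where rejected counts the items that were
    discarded on arrival without ever entering the heap.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = random.Random(seed)
    # heapq is a min-heap, so keys are negated to keep the largest key on top.
    # The index breaks ties so that the items themselves are never compared.
    heap = []
    rejected = 0
    for index, item in enumerate(items):
        key = rng.random()
        if len(heap) < k:
            heapq.heappush(heap, (-key, index, item))
        elif k > 0 and key < -heap[0][0]:
            # The new key beats the current largest kept key: swap it in.
            heapq.heapreplace(heap, (-key, index, item))
        else:
            rejected += 1
    return [item for _, _, item in heap], rejected


if __name__ == "__main__":
    sample, rejected = fixed_size_sample(range(1_000_000), 10, seed=42)
    print(len(sample), rejected)
```

Storage stays at k entries, and any item whose key is larger than the current threshold is dropped immediately. Rejecting more items early, for example with tighter thresholds, is one way the storage and computation cost of fixed-size sampling could be reduced.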