[ 
https://issues.apache.org/jira/browse/FLINK-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GaoLun updated FLINK-2535:
--------------------------
    Attachment: sampling.png

Statistics on the number of rejected items with SRS and SSRS.

> Fixed size sample algorithm optimization
> ----------------------------------------
>
>                 Key: FLINK-2535
>                 URL: https://issues.apache.org/jira/browse/FLINK-2535
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chengxiang Li
>            Priority: Minor
>         Attachments: sampling.png
>
>
> Fixed-size sampling is known to be less efficient than fraction-based 
> sampling, but sometimes it is necessary. Some optimizations could 
> significantly reduce the storage size and computation cost, such as the 
> algorithm described in [this 
> paper|http://machinelearning.wustl.edu/mlpapers/papers/icml2013_meng13a].
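For context on the description above, here is a minimal sketch of a baseline fixed-size sampler without replacement. It is not Flink's implementation and not the algorithm from the linked paper. It assumes the common random-key approach: each item gets a uniform random key, and the k items with the smallest keys form the sample. The class name FixedSizeSampleSketch and the method sample are hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical illustration class; not part of Flink.
public class FixedSizeSampleSketch {

    /** An item paired with the random key drawn for it. */
    private static final class Keyed<T> {
        final double key;
        final T item;

        Keyed(double key, T item) {
            this.key = key;
            this.item = item;
        }
    }

    /**
     * Draws up to k items uniformly at random, without replacement,
     * in a single pass over the input.
     */
    public static <T> List<T> sample(Iterator<T> input, int k, long seed) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        if (k == 0) {
            return new ArrayList<>();
        }
        Random rnd = new Random(seed);
        // Max-heap on the key: the head is the largest key currently retained,
        // so it is the first candidate to be evicted.
        PriorityQueue<Keyed<T>> heap =
                new PriorityQueue<>(k, (a, b) -> Double.compare(b.key, a.key));

        while (input.hasNext()) {
            T item = input.next();
            double key = rnd.nextDouble();
            if (heap.size() < k) {
                heap.add(new Keyed<>(key, item));
            } else if (key < heap.peek().key) {
                // The new key beats the worst retained key: swap it in.
                heap.poll();
                heap.add(new Keyed<>(key, item));
            }
            // Otherwise the item is rejected immediately.
        }

        List<T> result = new ArrayList<>(heap.size());
        for (Keyed<T> entry : heap) {
            result.add(entry.item);
        }
        return result;
    }
}
```

In this baseline the rejection threshold is simply the largest key currently in the heap, so it only tightens as more items are seen. An optimization of the kind the description refers to would presumably reject more items earlier, which is what the attached rejected-item counts compare.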



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
