[ https://issues.apache.org/jira/browse/DATAFU-16?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883575#comment-13883575 ]
Matthew Hayes commented on DATAFU-16:
-------------------------------------
Thanks for running the experiment, Jian! I expected there might be an issue
with the "weighted reservoir sampling exponential jump algebraic" case. I
think that the exponential jump method only works on an accumulate-based model.
For algebraic, the use of a combiner probably breaks the assumptions behind
this approach.
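
To make the concern concrete, here is a minimal sketch of weighted reservoir
sampling with exponential jumps (the A-ExpJ scheme of Efraimidis and Spirakis),
written as one sequential pass. It is not the attached
ScoredExpJmpReservoir.java, and the class and method names are hypothetical.
The thing to notice is the `jump` field: it is running state tied to a single
ordered stream and to the current minimum key, which is the kind of assumption
a combiner would likely not preserve when it merges partial results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical sketch of A-ExpJ; not the DataFu implementation.
final class ExpJumpReservoirSketch {
    private static final class Entry {
        final String item;
        final double key;
        Entry(String item, double key) { this.item = item; this.key = key; }
    }

    private final int k;
    private final Random rng;
    // Min-heap on key, so the smallest key (the threshold) is at the head.
    private final PriorityQueue<Entry> heap =
        new PriorityQueue<>((a, b) -> Double.compare(a.key, b.key));
    // Total weight still to be skipped before the next replacement.
    private double jump;

    ExpJumpReservoirSketch(int k, Random rng) {
        if (k <= 0) throw new IllegalArgumentException("k must be positive");
        this.k = k;
        this.rng = rng;
    }

    // Feed items one at a time, in stream order.
    void accept(String item, double weight) {
        if (weight <= 0) throw new IllegalArgumentException("weight must be positive");
        if (heap.size() < k) {
            // Fill phase: key = u^(1/w), as in the non-jumping variant.
            heap.add(new Entry(item, Math.pow(rng.nextDouble(), 1.0 / weight)));
            if (heap.size() == k) drawJump();
            return;
        }
        jump -= weight;
        if (jump <= 0) {
            // This item enters the reservoir; its key is drawn from (t, 1)
            // so that it is guaranteed to beat the current threshold.
            double t = Math.pow(heap.peek().key, weight);
            double r = t + (1.0 - t) * rng.nextDouble();
            heap.poll();
            heap.add(new Entry(item, Math.pow(r, 1.0 / weight)));
            drawJump();
        }
    }

    // Draw the next jump length from the current threshold key.
    private void drawJump() {
        double u = 1.0 - rng.nextDouble(); // in (0, 1], avoids log(0)
        jump = Math.log(u) / Math.log(heap.peek().key);
    }

    List<String> sample() {
        List<String> out = new ArrayList<>();
        for (Entry e : heap) out.add(e.item);
        return out;
    }

    public static void main(String[] args) {
        ExpJumpReservoirSketch s = new ExpJumpReservoirSketch(3, new Random());
        for (int i = 0; i < 1000; i++) s.accept("item" + i, 1.0 + (i % 10));
        System.out.println(s.sample());
    }
}
```

In this form the skipped items never get a key at all, so a partial reservoir
built on one split carries no information about them. That seems to be why
merging such partial results in a combiner is problematic, whereas the plain
variant that assigns a key to every item can simply keep the top k keys
across partial results.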
> weighted reservoir sampling with exponential jumps UDF
> ------------------------------------------------------
>
> Key: DATAFU-16
> URL: https://issues.apache.org/jira/browse/DATAFU-16
> Project: DataFu
> Issue Type: New Feature
> Environment: Mac, Linux
> pig-0.11
> Reporter: jian wang
> Priority: Minor
> Attachments: ScoredExpJmpReservoir.java, ScoredReservoir.java,
> WeightedSamplingCorrectnessTests.java
>
>
> Create a weightedReservoirSampleWithExpJump UDF to implement the weighted
> reservoir sampling algorithm with exponential jumps. Investigation is tracked
> in https://github.com/linkedin/datafu/issues/80. This task is part of an
> experiment comparing different weighted sampling algorithms.