[ 
https://issues.apache.org/jira/browse/DATAFU-21?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jian wang updated DATAFU-21:
----------------------------

    Description: 
This issue tracks the investigation into a weighted sampler that does not use 
an internal reservoir. 

At present, SimpleRandomSample implements a good acceptance-rejection 
sampling algorithm for probability random sampling. The weighted sampler could 
build on SimpleRandomSample with a slight modification.

One possible modification: the current SimpleRandomSample generates a uniform 
random number in (0, 1) as the random variable used to accept or reject an 
item. The weighted sampler could instead generate this random variable based 
on the item's weight, so that it still lies in (0, 1) and the random variables 
of different items remain independent of each other.
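
As a rough illustration only, here is a minimal Python sketch of one 
transformation that fits this description. It is an assumption, not something 
decided in this issue and not the SimpleRandomSample code: it uses the key 
u^(1/w) from Efraimidis and Spirakis' weighted sampling scheme, which maps a 
uniform u in (0, 1) and a positive weight w to a value that is mathematically 
still in (0, 1). The names weighted_key and threshold_sample, and the fixed 
threshold, are hypothetical.

```python
import random


def weighted_key(weight, rng):
    """Turn one uniform draw into a weight-dependent key.

    Assumed transformation: key = u ** (1 / weight). Mathematically the key
    lies in (0, 1); extreme weights can round to 0.0 or 1.0 in floating point.
    """
    if weight <= 0:
        raise ValueError("weight must be positive")
    u = rng.random()
    while u == 0.0:  # random() may return 0.0; redraw to stay in the open interval
        u = rng.random()
    return u ** (1.0 / weight)


def threshold_sample(items, threshold, rng):
    """Accept each (value, weight) pair independently, without a reservoir.

    An item is kept when its key exceeds the (hypothetical) fixed threshold,
    so the acceptance probability is 1 - threshold ** weight.
    """
    return [value for value, weight in items
            if weighted_key(weight, rng) > threshold]
```

Each item draws its own uniform number, so the keys are independent, and a 
larger weight pushes the key toward 1 and raises the chance of acceptance. A 
fixed threshold does not yield an exact sample size; how to choose or adapt 
the threshold is part of what still needs to be worked out.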

The correctness of this approach, and how to implement it efficiently, still 
need further thought and experimentation.

  was:
This issue tracks the investigation into a weighted sampler that does not use 
an internal reservoir. 

At present, SimpleRandomSample implements a good acceptance-rejection 
sampling algorithm for probability random sampling. The weighted sampler could 
build on SimpleRandomSample with a slight modification.

One possible modification: the current SimpleRandomSample generates a uniform 
random number in (0, 1) as the random variable used to accept or reject an 
item. The weighted sampler could instead generate this random variable based 
on the item's weight, so that it still lies in (0, 1) and the random variables 
of different items remain independent of each other.

The correctness of this approach, and how to implement it efficiently, still 
need further thought.


> Probability weighted sampling without reservoir
> -----------------------------------------------
>
>                 Key: DATAFU-21
>                 URL: https://issues.apache.org/jira/browse/DATAFU-21
>             Project: DataFu
>          Issue Type: New Feature
>         Environment: Mac OS, Linux
>            Reporter: jian wang
>
> This issue tracks the investigation into a weighted sampler that does not 
> use an internal reservoir. 
> At present, SimpleRandomSample implements a good acceptance-rejection 
> sampling algorithm for probability random sampling. The weighted sampler 
> could build on SimpleRandomSample with a slight modification.
> One possible modification: the current SimpleRandomSample generates a 
> uniform random number in (0, 1) as the random variable used to accept or 
> reject an item. The weighted sampler could instead generate this random 
> variable based on the item's weight, so that it still lies in (0, 1) and the 
> random variables of different items remain independent of each other.
> The correctness of this approach, and how to implement it efficiently, still 
> need further thought and experimentation.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
