[ 
https://issues.apache.org/jira/browse/MAHOUT-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13040977#comment-13040977
 ] 

Lance Norskog edited comment on MAHOUT-676 at 5/30/11 2:43 AM:
---------------------------------------------------------------

bq. Normally slice samplers are used in the sense that Radford Neal proposed in 
his 2003 (I think) paper.
I based this on the Wikipedia entry, [Slice 
Sampling|http://en.wikipedia.org/wiki/Slice_sampling], and yes, it's Neal 2003.

Yes, the normal use of slice sampling is to efficiently draw a set of samples 
distributed according to a PDF, i.e. uniformly from the area under its curve. 
That would also be a useful implementation.
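
For reference, here is a minimal sketch of that normal use: one-dimensional 
slice sampling with the stepping-out and shrinkage steps described on the 
Wikipedia page. The class name, constructor parameters, and the MAX_STEPS guard 
are illustrative assumptions, not code from the attached patches.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

// Illustrative sketch only: names are hypothetical, not taken from MAHOUT-676.patch.
final class SliceSamplerSketch {
  private static final int MAX_STEPS = 1000; // guard for densities that never drop off

  private final DoubleUnaryOperator density; // unnormalized density, must be >= 0
  private final double width;                // initial bracket width
  private final Random random;
  private double current;                    // current point of the Markov chain

  SliceSamplerSketch(DoubleUnaryOperator density, double start, double width, Random random) {
    if (!(density.applyAsDouble(start) > 0.0) || !(width > 0.0)) {
      throw new IllegalArgumentException("need density(start) > 0 and width > 0");
    }
    this.density = density;
    this.current = start;
    this.width = width;
    this.random = random;
  }

  /** Advances the chain by one step and returns the new sample. */
  double next() {
    // 1. Pick a height uniformly in [0, density(current)); this defines the slice.
    double height = random.nextDouble() * density.applyAsDouble(current);

    // 2. Step out: place a bracket of size width randomly around current,
    //    then widen it until both ends fall outside the slice.
    double left = current - random.nextDouble() * width;
    double right = left + width;
    for (int i = 0; i < MAX_STEPS && density.applyAsDouble(left) > height; i++) {
      left -= width;
    }
    for (int i = 0; i < MAX_STEPS && density.applyAsDouble(right) > height; i++) {
      right += width;
    }

    // 3. Shrink: draw uniformly from the bracket; a rejected draw becomes a new bracket end.
    while (true) {
      double candidate = left + random.nextDouble() * (right - left);
      if (density.applyAsDouble(candidate) > height) {
        current = candidate;
        return candidate;
      }
      if (candidate < current) {
        left = candidate;
      } else {
        right = candidate;
      }
    }
  }
}
```

Successive samples are correlated, as with any Markov chain method, so a caller 
that needs nearly independent draws would keep only every few samples.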

bq. you can use bisection until you get a unique result.
This patch's Sampler interface gives one decision per call, so the naive 
implementation here seems the cleanest. I also have a bisection implementation 
that uses the windowing algorithm from the Wikipedia page.

bq. What is the need being satisfied here?
See the issue description.
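
To illustrate what "one decision per call" can look like, here is a small 
sketch. The DecisionSampler interface and the UnderCurveSampler class are 
hypothetical stand-ins; the actual Sampler signature in the patch is not shown 
in this thread. Each call draws a height uniformly below an assumed upper bound 
and keeps the value only if that height falls under the curve.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

// Hypothetical interface: the patch's real Sampler signature is not shown in this thread.
interface DecisionSampler {
  /** Returns true to keep the value, false to throw it away. */
  boolean keep(double value);
}

// Keeps a value when a uniformly drawn height lands under the curve at that value.
final class UnderCurveSampler implements DecisionSampler {
  private final DoubleUnaryOperator curve; // keep-weight per value, expected in [0, peak]
  private final double peak;               // assumed upper bound of the curve
  private final Random random;

  UnderCurveSampler(DoubleUnaryOperator curve, double peak, Random random) {
    if (!(peak > 0.0)) {
      throw new IllegalArgumentException("peak must be positive");
    }
    this.curve = curve;
    this.peak = peak;
    this.random = random;
  }

  @Override
  public boolean keep(double value) {
    double height = random.nextDouble() * peak; // uniform in [0, peak)
    return height < curve.applyAsDouble(value);
  }
}
```

A value whose curve height equals the peak is always kept, and a value whose 
curve height is zero is always thrown away.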




      was (Author: lancenorskog):
    bq. Normally slice samplers are used in the sense that Radford Neal 
proposed in his 2003 (I think) paper.
The Wikipedia entry is my base: [Slice 
Sampling|http://en.wikipedia.org/wiki/Slice_sampling] and yes, it's Neal 2003.

Yes, the normal use of slice sampling is to efficiently find a set of samples 
corresponding to the PDF/area under curve. That would also be a useful 
implementation. 

bq. you can use bisection until you get a unique result.
I have a bisection implementation using the windowing algorithm in the wiki 
page. Bu

This patch's Sampler interface gives one decision per call. The stupid 
implementation here seems the cleanest.





  
> Random samplers in a modular library
> ------------------------------------
>
>                 Key: MAHOUT-676
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-676
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Math
>            Reporter: Lance Norskog
>            Priority: Minor
>         Attachments: MAHOUT-676.patch, Sampler.patch
>
>
> This is a modular suite of samplers. It supplies the ability to throw away 
> samples in a useful way. 
> Here is a use case: for my recommendations, I want user activity to decide 
> the amount of influence on the results. Bucketing users by the number of 
> movies they watch, I want users with 1-5 movies to make up 20% of the sample, 
> 6-15 to make up 50%, and 15-30 to make up 30%; users who watch over 30 movies 
> are not useful.
> * If I know the input distribution, I can supply a function to the Slice 
> sampler to give this distribution. 
> * If I don't know the distribution, I can create a Reservoir sampler for each 
> of the three buckets. After reading the whole set, I check the sizes of the 
> various buckets and solve for my distribution. This gives the number of users 
> to pull from each bucket.
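
The reservoir approach in the second bullet can be sketched as follows. This is 
a minimal illustration under stated assumptions: BucketReservoir, BucketQuotas, 
and their methods are hypothetical names rather than classes from the attached 
patches, a count of exactly 15 is assigned to the middle bucket because the 
ranges in the description overlap there, and "solve for my distribution" is 
read as finding the largest total whose 20/50/30 split fits in every bucket.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only: names are hypothetical, not taken from the attached patches.
final class BucketReservoir<T> {
  private final List<T> items = new ArrayList<>();
  private final int capacity;
  private final Random random;
  private long seen; // how many items have been offered so far

  BucketReservoir(int capacity, Random random) {
    if (capacity <= 0) {
      throw new IllegalArgumentException("capacity must be positive");
    }
    this.capacity = capacity;
    this.random = random;
  }

  /** Algorithm R: each offered item stays retained with probability capacity / seen. */
  void offer(T item) {
    seen++;
    if (items.size() < capacity) {
      items.add(item);
      return;
    }
    long slot = (long) (random.nextDouble() * seen); // uniform in [0, seen)
    if (slot < capacity) {
      items.set((int) slot, item);
    }
  }

  int size() {
    return items.size();
  }

  /** Returns n retained items chosen uniformly at random (requires n <= size()). */
  List<T> take(int n) {
    List<T> copy = new ArrayList<>(items);
    Collections.shuffle(copy, random);
    return new ArrayList<>(copy.subList(0, n));
  }
}

final class BucketQuotas {
  private BucketQuotas() {}

  /** Maps a movie count to a bucket index, or -1 for users to discard. */
  static int bucketOf(int moviesWatched) {
    if (moviesWatched < 1 || moviesWatched > 30) return -1;
    if (moviesWatched <= 5) return 0;
    if (moviesWatched <= 15) return 1; // assumption: a count of 15 goes to the middle bucket
    return 2;
  }

  /**
   * Finds the largest total whose per-bucket share fits in every bucket,
   * then returns how many users to pull from each bucket. Shares must be positive.
   */
  static int[] solve(int[] available, double[] shares) {
    double total = Double.POSITIVE_INFINITY;
    for (int i = 0; i < shares.length; i++) {
      total = Math.min(total, available[i] / shares[i]);
    }
    int[] quotas = new int[shares.length];
    for (int i = 0; i < shares.length; i++) {
      // The small epsilon absorbs floating-point error before rounding down.
      int quota = (int) Math.floor(total * shares[i] + 1e-9);
      quotas[i] = Math.min(available[i], quota);
    }
    return quotas;
  }
}
```

For example, if the three reservoirs end up holding 1000, 400, and 900 users, 
solve with shares 0.2, 0.5, and 0.3 gives quotas of 160, 400, and 240: the 6-15 
bucket is the binding constraint, since 400 / 0.5 allows only 800 users in total.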

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
