Frank McQuillan created MADLIB-1156:
---------------------------------------

             Summary: Improve and promote sampling algorithms to top level modules
                 Key: MADLIB-1156
                 URL: https://issues.apache.org/jira/browse/MADLIB-1156
             Project: Apache MADlib
          Issue Type: Improvement
          Components: Module: Sampling
            Reporter: Frank McQuillan
             Fix For: v2.0


Story

As a MADlib user, I want to sample a data table using the different techniques 
and distributions described in references [1] and [2], so that I can do model 
building using the sampled data sets.  Also, I want to ensure that these 
algorithms are properly documented and tested.  

Candidate for 2.0, since it may involve interface changes.

Acceptance

1) Are these algorithms ready to promote from a software quality perspective?
2) Define and implement any required interface changes.
3) Define and implement any required IC and Tinc tests.
4) Write documentation and provide examples.

References

[1] Existing MADlib sample function
http://madlib.incubator.apache.org/docs/latest/group__grp__sample.html

[2] Other MADlib sample functions
http://madlib.incubator.apache.org/docs/latest/sample_8sql__in.html



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
