Frank McQuillan created MADLIB-1156:
---------------------------------------
Summary: Improve and promote sampling algorithms to top-level modules
Key: MADLIB-1156
URL: https://issues.apache.org/jira/browse/MADLIB-1156
Project: Apache MADlib
Issue Type: Improvement
Components: Module: Sampling
Reporter: Frank McQuillan
Fix For: v2.0
Story
As a MADlib user, I want to sample a data table using the different techniques
and distributions described in references [1] and [2], so that I can build
models on the sampled data sets. I also want to ensure that these
algorithms are properly documented and tested.
Candidate for 2.0, since it may involve interface changes.
Acceptance
1) Are these algorithms ready to promote from a software quality perspective?
2) Define and implement any interface changes required.
3) Define and implement any IC and Tinc tests required.
4) Write documentation and provide examples.
References
[1] Existing MADlib sample function
http://madlib.incubator.apache.org/docs/latest/group__grp__sample.html
[2] Other MADlib sample functions
http://madlib.incubator.apache.org/docs/latest/sample_8sql__in.html