Hi there,

I have created a table of numbers using CLUSTERED BY and am sampling it by bucket. I need to select 10,000 candidates from ~125 million. How can I get a good random selection? Should I create 12,500 clusters? Or should I create 100 clusters and then use the sample function (... from 12500)?

Simon
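
P.S. To make the numbers concrete, here is a rough sketch of the arithmetic behind the two options. It is plain Python rather than a query, the function names are made up for illustration, and it assumes rows spread evenly across buckets, which may not hold in practice.

```python
# Rough arithmetic only. Assumes rows are spread evenly across buckets,
# which is an assumption and not something I have verified.

POPULATION = 125_000_000  # "~125m" from the question
WANTED = 10_000           # candidates to select


def sample_denominator(population: int, wanted: int) -> int:
    """Return y such that keeping 1 row in y gives about `wanted` rows."""
    return population // wanted


def rows_per_bucket(population: int, buckets: int) -> int:
    """Approximate bucket size if rows are evenly distributed."""
    return population // buckets


# Overall sampling rate needed: 1 row in every `one_in` rows.
one_in = sample_denominator(POPULATION, WANTED)

# Option A: 12,500 clusters, so one whole bucket is about the target size.
option_a_bucket = rows_per_bucket(POPULATION, 12_500)

# Option B: 100 clusters, so one bucket is far too big and must be thinned.
option_b_bucket = rows_per_bucket(POPULATION, 100)
option_b_thinning = option_b_bucket // WANTED  # keep 1 in this many rows

print(one_in)             # 12500
print(option_a_bucket)    # 10000
print(option_b_bucket)    # 1250000
print(option_b_thinning)  # 125
```

So either way the overall rate is 1 in 12,500; the options differ only in whether that rate comes from the bucket count alone or from a bucket plus a further 1-in-125 cut.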