Stratified sampling with DataFrames

Karthikeyan Muthukumar Mon, 11 May 2015 12:34:02 -0700

Hi,
I'm on Spark 1.3.0 and my data is in DataFrames.
I need operations like sampleByKey(), sampleByKeyExact().
I saw the JIRA "Add approximate stratified sampling to DataFrame" (
https://issues.apache.org/jira/browse/SPARK-7157).
That is targeted for Spark 1.5. Until it lands, what is the easiest
way to accomplish the equivalent of sampleByKey() and sampleByKeyExact() on
DataFrames?
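To make the semantics I'm after concrete, here is a minimal plain-Python sketch (not Spark code) of how I understand the two RDD operations when sampling without replacement: sampleByKey() keeps each record independently with its key's fraction, so per-key counts are only approximate, while sampleByKeyExact() returns exactly ceil(fraction * count) records per key. The function names, the (key, value) tuple layout and the seed argument are illustrative assumptions, not Spark API.

```python
import math
import random
from collections import Counter, defaultdict


def sample_by_key(records, fractions, seed=None):
    """Approximate stratified sample: keep each (key, value) record
    independently with probability fractions[key]."""
    rng = random.Random(seed)
    return [(k, v) for k, v in records if rng.random() < fractions[k]]


def sample_by_key_exact(records, fractions, seed=None):
    """Exact stratified sample: for each key, draw exactly
    ceil(fractions[key] * count_for_key) records without replacement."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for k, v in records:
        groups[k].append(v)
    sample = []
    for k, values in groups.items():
        n = math.ceil(len(values) * fractions[k])
        sample.extend((k, v) for v in rng.sample(values, n))
    return sample


# Hypothetical data: 100 records under key "a", 50 under key "b".
records = [("a", i) for i in range(100)] + [("b", i) for i in range(50)]
fractions = {"a": 0.25, "b": 0.5}

approx = sample_by_key(records, fractions, seed=42)
exact = sample_by_key_exact(records, fractions, seed=42)

# Per-key sizes: approx_counts varies with the seed, exact_counts does not.
approx_counts = Counter(k for k, _ in approx)
exact_counts = Counter(k for k, _ in exact)
```

With these fractions the exact variant always yields 25 records for each key, whereas the approximate variant only averages 25 per key across seeds. What I'm looking for is the DataFrame-side equivalent of both behaviours.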
Thanks & Regards
MK
