Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r15534415
--- Diff:
core/src/main/scala/org/apache/spark/util/random/SamplingUtils.scala ---
@@ -88,14 +91,73 @@ private[spark] object SamplingUtils {
*/
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50521377
QA tests have started for PR 1025. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17368/consoleFull
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50527473
QA results for PR 1025:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50528615
LGTM. Merged into master. Thanks!!
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/1025
---
GitHub user dorx reopened a pull request:
https://github.com/apache/spark/pull/1025
[SPARK-2082] stratified sampling in PairRDDFunctions that guarantees exact
sample size
Implemented stratified sampling that guarantees exact sample size using
ScaSRS with two passes over the RDD
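To make "exact sample size" concrete, here is a minimal, self-contained sketch of two-pass stratified sampling without replacement. It is not the algorithm the PR description names and not the PR's code: it simply sorts every stratum by a random tag, which a scalable implementation would avoid. The function name `sample_by_key_exact`, the `fractions` mapping, and the choice of rounding the target up with `ceil` are assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def sample_by_key_exact(pairs, fractions, seed=None):
    """Keep exactly ceil(fractions[key] * count[key]) items for every key.

    Hypothetical helper for illustration only; `pairs` is a list of
    (key, value) tuples and `fractions` maps each key to a sampling rate.
    """
    rng = random.Random(seed)

    # Pass 1: count how many items each stratum (key) contains.
    counts = defaultdict(int)
    for key, _ in pairs:
        counts[key] += 1

    # Target sample size per stratum (rounding up is an assumption here).
    targets = {k: math.ceil(fractions[k] * n) for k, n in counts.items()}

    # Pass 2: tag every item with a uniform random number.
    tagged = defaultdict(list)
    for key, value in pairs:
        tagged[key].append((rng.random(), value))

    # Keep the items with the smallest tags in each stratum.
    sample = []
    for key, items in tagged.items():
        items.sort(key=lambda t: t[0])
        sample.extend((key, value) for _, value in items[:targets[key]])
    return sample
```

With 40 items under key "a" at rate 0.5 and 8 items under key "b" at rate 0.25, every run returns 20 and 2 items respectively, whatever the seed; only which items are chosen varies.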
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50379743
QA tests have started for PR 1025. This patch DID NOT merge cleanly!
View progress:
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50380482
QA tests have started for PR 1025. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17296/consoleFull
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50381620
QA results for PR 1025:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50383895
QA tests have started for PR 1025. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17300/consoleFull
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50388808
QA results for PR 1025:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50393886
QA results for PR 1025:
- This patch FAILED unit tests.

For more information see test
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50409833
QA tests have started for PR 1025. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17307/consoleFull
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50414198
QA results for PR 1025:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50290028
@dorx I removed commons-math3 from dependencies, separated `sampleByKey`
and `sampleByKeyExact`, and corrected the math in waitlisting in sampling with
replacement.
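The difference between an approximate and an exact per-key sampler can be shown with a short sketch. The one-pass version below keeps each item independently with the key's probability, so the number kept per key only matches the requested fraction on average. This is an illustration, not the code from the PR; the function name `sample_by_key_approx` and the `fractions` mapping are hypothetical.

```python
import random


def sample_by_key_approx(pairs, fractions, seed=None):
    """One-pass Bernoulli sampling: sizes per key are approximate.

    Hypothetical helper for illustration only; `pairs` is a list of
    (key, value) tuples and `fractions` maps each key to a probability.
    """
    rng = random.Random(seed)
    # Each item is kept independently with probability fractions[key].
    return [(k, v) for k, v in pairs if rng.random() < fractions[k]]
```

An exact variant needs extra work to fix the per-key count, for example an additional pass to count items per key, which is one reason to expose it as a separate method.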
Github user dorx closed the pull request at:
https://github.com/apache/spark/pull/1025
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50065418
QA tests have started for PR 1025. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17130/consoleFull
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50070185
QA results for PR 1025:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
case class
Github user dorx commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50071767
Looks like there are some API changes from Xiangrui's updates. @mateiz
@pwendell
---
Github user dorx commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50073867
Also, seems like there wasn't a single line of code preserved from before
the updates. We should probably close this PR and let Xiangrui submit his
version in a separate PR
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50101767
Sorry, how was the API changed? Was it making `sampleByKeyExact` a separate
method and making it experimental? That actually seems okay to me, the
algorithm there is
Github user falaki commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50104259
This is the first place we introduce 'exact' to our API. We already have
'approx' in function names. I think having both of them is confusing to users.
---
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-50109108
Well, the other sample functions are already approximate anyway. I kind
of like this here because it conveys that it's more expensive. The other thing
is that if we want
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14906065
--- Diff: pom.xml ---
@@ -257,6 +257,11 @@
<version>1.5</version>
</dependency>
<dependency>
+
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14906349
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -195,6 +193,37 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
}
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14906412
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -195,6 +193,37 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
}
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14906680
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14906754
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14906825
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14906919
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14907155
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14907202
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14907335
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14907358
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -195,6 +193,37 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
}
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14907544
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -195,6 +193,37 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
}
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14907579
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14907639
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14907668
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14907670
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14907687
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14907870
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14907896
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48985401
QA tests have started for PR 1025. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16650/consoleFull
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48989787
QA results for PR 1025:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14672589
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,335 @@
+/*
+ * Licensed to the Apache Software
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48384891
Merged build triggered.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48384908
Merged build started.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48386184
Refer to this link for build results:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16414/
---
Github user dorx commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48386518
Jenkins, retest this please.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48386790
Merged build triggered.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48386807
Merged build started.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48388111
Refer to this link for build results:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16416/
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48388110
Merged build finished.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48414125
Merged build triggered.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48414132
Merged build started.
---
Github user falaki commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14688121
--- Diff:
core/src/main/scala/org/apache/spark/util/random/SamplingUtils.scala ---
@@ -45,11 +50,75 @@ private[spark] object SamplingUtils {
val
Github user falaki commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14688338
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,310 @@
+/*
+ * Licensed to the Apache Software
Github user falaki commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14688363
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,310 @@
+/*
+ * Licensed to the Apache Software
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14688550
--- Diff:
core/src/main/scala/org/apache/spark/util/random/SamplingUtils.scala ---
@@ -45,11 +50,75 @@ private[spark] object SamplingUtils {
val
Github user falaki commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14688585
--- Diff:
core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala ---
@@ -83,6 +83,120 @@ class PairRDDFunctionsSuite extends FunSuite with
Github user falaki commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14688624
--- Diff:
core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala ---
@@ -83,6 +83,120 @@ class PairRDDFunctionsSuite extends FunSuite with
Github user falaki commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14688613
--- Diff:
core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala ---
@@ -83,6 +83,120 @@ class PairRDDFunctionsSuite extends FunSuite with
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48416874
All automated tests passed.
Refer to this link for build results:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16431/
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48416873
Merged build finished. All automated tests passed.
---
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14688633
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,310 @@
+/*
+ * Licensed to the Apache Software
Github user falaki commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14688702
--- Diff:
core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala ---
@@ -83,6 +83,120 @@ class PairRDDFunctionsSuite extends FunSuite with
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48418905
Merged build triggered.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48418912
Merged build started.
---
Github user dorx commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48419179
Holding out on updating the docs until the Python version is supported.
For the Python version, any objections to using _jrdd to invoke the Java
version of sampleByKey?
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48419506
Refer to this link for build results:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16439/
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48419505
Merged build finished.
---
Github user dorx commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48419578
Jenkins, retest this please.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48419772
Merged build triggered.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48419784
Merged build started.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48422316
Merged build finished. All automated tests passed.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1025#issuecomment-48422320
All automated tests passed.
Refer to this link for build results:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16441/
---
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14691480
--- Diff: pom.xml ---
@@ -257,6 +257,11 @@
<version>1.5</version>
</dependency>
<dependency>
+
Github user dorx commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14691577
--- Diff: pom.xml ---
@@ -257,6 +257,11 @@
<version>1.5</version>
</dependency>
<dependency>
+
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694237
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -195,6 +193,37 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
}
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694233
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala
---
@@ -130,6 +130,38 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
new
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694234
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala
---
@@ -130,6 +130,38 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
new
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694253
--- Diff:
core/src/main/scala/org/apache/spark/util/random/SamplingUtils.scala ---
@@ -45,11 +50,78 @@ private[spark] object SamplingUtils {
val
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694262
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694258
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694259
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694250
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -195,6 +193,37 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
}
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694252
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -195,6 +193,37 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
}
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694251
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -195,6 +193,37 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
}
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694248
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -195,6 +193,37 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
}
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694254
--- Diff:
core/src/main/scala/org/apache/spark/util/random/SamplingUtils.scala ---
@@ -45,11 +50,78 @@ private[spark] object SamplingUtils {
val
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694267
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694277
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694296
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694292
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694294
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694274
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694285
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694278
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694281
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1025#discussion_r14694283
--- Diff:
core/src/main/scala/org/apache/spark/util/random/StratifiedSampler.scala ---
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software